PREFACE TO THIRD EDITION

The book has again been mostly rewritten to bring in various 
improvements. The chief of these is the use of the notation of bra 
and ket vectors, which I have developed since 1939. This notation 
allows a more direct connexion to be made between the formalism 
in terms of the abstract quantities corresponding to states and 
observables and the formalism in terms of representatives—in fact 
the two formalisms become welded into a single comprehensive 
scheme. With the help of this notation several of the deductions in 
the book take a simpler and neater form. 

Other substantial alterations include: 

(i) A new presentation of the theory of systems with similar 
particles, based on Fock’s treatment of the theory of radiation 
adapted to the present notation. This treatment is simpler and more 
powerful than the one given in earlier editions of the book. 

(ii) A further development of quantum electrodynamics, including 
the theory of the Wentzel field. The theory of the electron in
interaction with the electromagnetic field is carried as far as it can be at
the present time without getting on to speculative ground. 

P. A. M. D. 


ST. JOHN’S COLLEGE, CAMBRIDGE 

21 April 1947 



FROM THE 

PREFACE TO THE SECOND EDITION 

The book has been mostly rewritten. I have tried by carefully
overhauling the method of presentation to give the development of the
theory in a rather less abstract form, without making any sacrifices 
in exactness of expression or in the logical character of the
development. This should make the work suitable for a wider circle of
readers, although the reader who likes abstractness for its own sake 
may possibly prefer the style of the first edition. 

The main change has been brought about by the use of the word 
'state' in a three-dimensional non-relativistic sense. It would seem
at first sight a pity to build up the theory largely on the basis of
non-relativistic concepts. The use of the non-relativistic meaning of
'state', however, contributes so essentially to the possibilities of
clear exposition as to lead one to suspect that the fundamental ideas 
of the present quantum mechanics are in need of serious alteration at 
just this point, and that an improved theory would agree more closely 
with the development here given than with a development which 
aims at preserving the relativistic meaning of 'state' throughout.

P. A. M. D. 


THE INSTITUTE FOR ADVANCED STUDY
PRINCETON 

27 November 1934 



FROM THE 

PREFACE TO THE FIRST EDITION 

The methods of progress in theoretical physics have undergone a 
vast change during the present century. The classical tradition 
has been to consider the world to be an association of observable 
objects (particles, fluids, fields, etc.) moving about according to 
definite laws of force, so that one could form a mental picture in 
space and time of the whole scheme. This led to a physics whose aim 
was to make assumptions about the mechanism and forces connecting 
these observable objects, to account for their behaviour in the 
simplest possible way. It has become increasingly evident in recent 
times, however, that nature works on a different plan. Her
fundamental laws do not govern the world as it appears in our mental
picture in any very direct way, but instead they control a
substratum of which we cannot form a mental picture without
introducing irrelevancies. The formulation of these laws requires the use
of the mathematics of transformations. The important things in 
the world appear as the invariants (or more generally the nearly 
invariants, or quantities with simple transformation properties) 
of these transformations. The things we are immediately aware of 
are the relations of these nearly invariants to a certain frame of 
reference, usually one chosen so as to introduce special simplifying 
features which are unimportant from the point of view of general 
theory. 

The growth of the use of transformation theory, as applied first to 
relativity and later to the quantum theory, is the essence of the new 
method in theoretical physics. Further progress lies in the direction 
of making our equations invariant under wider and still wider
transformations. This state of affairs is very satisfactory from a
philosophical point of view, as implying an increasing recognition of the
part played by the observer in himself introducing the regularities
that appear in his observations, and a lack of arbitrariness in the ways 
of nature, but it makes things less easy for the learner of physics. 
The new theories, if one looks apart from their mathematical setting, 
are built up from physical concepts which cannot be explained in 
terms of things previously known to the student, which cannot even 
be explained adequately in words at all. Like the fundamental
concepts (e.g. proximity, identity) which every one must learn on his
arrival into the world, the newer concepts of physics can be mastered
only by long familiarity with their properties and uses. 

From the mathematical side the approach to the new theories 
presents no difficulties, as the mathematics required (at any rate that 
which is required for the development of physics up to the present)
is not essentially different from what has been current for a
considerable time. Mathematics is the tool specially suited for dealing with
abstract concepts of any kind and there is no limit to its power in this 
field. For this reason a book on the new physics, if not purely
descriptive of experimental work, must be essentially mathematical. All the
same the mathematics is only a tool and one should learn to hold the 
physical ideas in one’s mind without reference to the mathematical 
form. In this book I have tried to keep the physics to the forefront, 
by beginning with an entirely physical chapter and in the later work
examining the physical meaning underlying the formalism wherever
possible. The amount of theoretical ground one has to cover before 
being able to solve problems of real practical value is rather large, but 
this circumstance is an inevitable consequence of the fundamental 
part played by transformation theory and is likely to become more 
pronounced in the theoretical physics of the future. 

With regard to the mathematical form in which the theory can be 
presented, an author must decide at the outset between two methods. 
There is the symbolic method, which deals directly in an abstract way 
with the quantities of fundamental importance (the invariants, etc., 
of the transformations) and there is the method of coordinates or 
representations, which deals with sets of numbers corresponding to 
these quantities. The second of these has usually been used for the 
presentation of quantum mechanics (in fact it has been used
practically exclusively with the exception of Weyl’s book Gruppentheorie
und Quantenmechanik). It is known under one or other of the two 
names ‘Wave Mechanics’ and ‘Matrix Mechanics’ according to which 
physical things receive emphasis in the treatment, the states of a 
system or its dynamical variables. It has the advantage that the kind 
of mathematics required is more familiar to the average student, and 
also it is the historical method. 

The symbolic method, however, seems to go more deeply into the 
nature of things. It enables one to express the physical laws in a neat
and concise way, and will probably be increasingly used in the future 
as it becomes better understood and its own special mathematics gets 
developed. For this reason I have chosen the symbolic method,
introducing the representatives later merely as an aid to practical 
calculation. This has necessitated a complete break from the
historical line of development, but this break is an advantage through
enabling the approach to the new ideas to be made as direct as 
possible. 


ST. JOHN’S COLLEGE, CAMBRIDGE
29 May 1930 


P. A. M. D. 
THE PRINCIPLE OF SUPERPOSITION


1. The need for a quantum theory

Classical mechanics has been developed continuously from the time
of Newton and applied to an ever-widening range of dynamical
systems, including the electromagnetic field in interaction with
matter. The underlying ideas and the laws governing their
application form a simple and elegant scheme, which one would be inclined
to think could not be seriously modified without having all its
attractive features spoilt. Nevertheless it has been found possible to
set up a new scheme, called quantum mechanics, which is more
suitable for the description of phenomena on the atomic scale and
which is in some respects more elegant and satisfying than the
classical scheme. This possibility is due to the changes which the
new scheme involves being of a very profound character and not
clashing with the features of the classical theory that make it so
attractive, as a result of which all these features can be incorporated
in the new scheme. 

The necessity for a departure from classical mechanics is clearly
shown by experimental results. In the first place the forces known
in classical electrodynamics are inadequate for the explanation of the
remarkable stability of atoms and molecules, which is necessary in
order that materials may have any definite physical and chemical
properties at all. The introduction of new hypothetical forces will not
save the situation, since there exist general principles of classical
mechanics, holding for all kinds of forces, leading to results in direct
disagreement with observation. For example, if an atomic system has
its equilibrium disturbed in any way and is then left alone, it will be set
in oscillation and the oscillations will get impressed on the
surrounding electromagnetic field, so that their frequencies may be observed
with a spectroscope. Now whatever the laws of force governing the
equilibrium, one would expect to be able to include the various
frequencies in a scheme comprising certain fundamental frequencies and
their harmonics. This is not observed to be the case. Instead, there
is observed a new and unexpected connexion between the frequencies,
called Ritz's Combination Law of Spectroscopy, according to which all
the frequencies can be expressed as differences between certain terms,
the number of terms being much less than the number of frequencies. 
This law is quite unintelligible from the classical standpoint. 
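
The counting argument behind the Combination Law can be illustrated by a minimal sketch. The term values below are hypothetical numbers in arbitrary units, chosen only for illustration and not taken from any spectrum; the function name is likewise an assumption of this sketch.

```python
from itertools import combinations

def combination_frequencies(terms):
    """Return every pairwise difference between term values, sorted."""
    return sorted(abs(a - b) for a, b in combinations(terms, 2))

# Hypothetical term values in arbitrary units (illustrative only).
terms = [100.0, 25.0, 11.0, 6.0, 4.0]
freqs = combination_frequencies(terms)

# n terms yield n(n-1)/2 differences, so the terms are far fewer
# than the frequencies they account for.
print(len(terms), "terms give", len(freqs), "frequencies")
```

With these assumed values, two frequencies sharing a term add up to a third frequency of the same set (75 + 14 = 89), which is the kind of relation a scheme of fundamental frequencies and harmonics would not produce.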

One might try to get over the difficulty without departing from 
classical mechanics by assuming each of the spectroscopically
observed frequencies to be a fundamental frequency with its own degree
of freedom, the laws of force being such that the harmonic vibrations 
do not occur. Such a theory will not do, however, even apart from 
the fact that it would give no explanation of the Combination Law, 
since it would immediately bring one into conflict with the
experimental evidence on specific heats. Classical statistical mechanics
enables one to establish a general connexion between the total number 
of degrees of freedom of an assembly of vibrating systems and its 
specific heat. If one assumes all the spectroscopic frequencies of an 
atom to correspond to different degrees of freedom, one would get a 
specific heat for any kind of matter very much greater than the 
observed value. In fact the observed specific heats at ordinary 
temperatures are given fairly well by a theory that takes into account 
merely the motion of each atom as a whole and assigns no internal 
motion to it at all. 
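
A rough numerical sketch of this conflict follows. It assumes the classical equipartition rule for a gas at constant volume, with R/2 per mole for each translational direction and R per mole for each internal vibration treated as an extra degree of freedom. The function name and the choice of 100 internal vibrations are assumptions made only for illustration.

```python
R = 8.314462618  # molar gas constant in J/(mol K)

def classical_molar_heat(n_internal_vibrations):
    """Classical molar heat at constant volume under equipartition.

    Assumes 3 translational directions (R/2 each) plus R for every
    internal vibration counted as an independent degree of freedom.
    """
    translational = 1.5 * R
    internal = n_internal_vibrations * R
    return translational + internal

atom_as_a_whole = classical_molar_heat(0)      # motion of the atom only
with_spectrum = classical_molar_heat(100)      # hypothetical: 100 spectral lines
print(atom_as_a_whole, with_spectrum)
```

Under these assumptions the value for the atom moving as a whole is about 12.5 J/(mol K), while counting every spectral frequency as a degree of freedom multiplies it many times over.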

This leads us to a new clash between classical mechanics and the 
results of experiment. There must certainly be some internal motion 
in an atom to account for its spectrum, but the internal degrees of 
freedom, for some classically inexplicable reason, do not contribute 
to the specific heat. A similar clash is found in connexion with the 
energy of oscillation of the electromagnetic field in a vacuum. Classical 
mechanics requires the specific heat corresponding to this energy to 
be infinite, but it is observed to be quite finite. A general conclusion 
from experimental results is that oscillations of high frequency do 
not contribute their classical quota to the specific heat. 

As another illustration of the failure of classical mechanics we may 
consider the behaviour of light. We have, on the one hand, the 
phenomena of interference and diffraction, which can be explained 
only on the basis of a wave theory; on the other, phenomena such as 
photo-electric emission and scattering by free electrons, which show 
that light is composed of small particles. These particles, which 
are called photons, have each a definite energy and momentum,
depending on the frequency of the light, and appear to have just as
real an existence as electrons, or any other particles known in physics.
A fraction of a photon is never observed.

Experiments have shown that this anomalous behaviour is not
peculiar to light, but is quite general. All material particles have 
wave properties, which can be exhibited under suitable conditions. 
We have here a very striking and general example of the breakdown 
of classical mechanics—not merely an inaccuracy in its laws of motion, 
but an inadequacy of its concepts to supply us with a description of 
atomic events.

The necessity to depart from classical ideas when one wishes to 
account for the ultimate structure of matter may be seen, not only 
from experimentally established facts, but also from general
philosophical grounds. In a classical explanation of the constitution of
matter, one would assume it to be made up of a large number of small
constituent parts and one would postulate laws for the behaviour of
these parts, from which the laws of the matter in bulk could be
deduced. This would not complete the explanation, however, since the
question of the structure and stability of the constituent parts is left
untouched. To go into this question, it becomes necessary to
postulate that each constituent part is itself made up of smaller parts, in
terms of which its behaviour is to be explained. There is clearly no 
end to this procedure, so that one can never arrive at the ultimate 
structure of matter on these lines. So long as big and small are merely 
relative concepts, it is no help to explain the big in terms of the small. 
It is therefore necessary to modify classical ideas in such a way as to 
give an absolute meaning to size. 

At this stage it becomes important to remember that science is 
concerned only with observable things and that we can observe an 
object only by letting it interact with some outside influence. An act 
of observation is thus necessarily accompanied by some disturbance 
of the object observed. We may define an object to be big when the 
disturbance accompanying our observation of it may be neglected, 
and small when the disturbance cannot be neglected. This definition 
is in close agreement with the common meanings of big and small. 

It is usually assumed that, by being careful, we may cut down the 
disturbance accompanying our observation to any desired extent. 
The concepts of big and small are then purely relative and refer to the 
gentleness of our means of observation as well as to the object being 
described. In order to give an absolute meaning to size, such as is 
required for any theory of the ultimate structure of matter, we have 
to assume that there is a limit to the fineness of our powers of observation 
and the smallness of the accompanying disturbance, a limit which is
inherent in the nature of things and can never be surpassed by improved
technique or increased skill on the part of the observer. If the object under
observation is such that the unavoidable limiting disturbance is
negligible, then the object is big in the absolute sense and we may apply
classical mechanics to it. If, on the other hand, the limiting
disturbance is not negligible, then the object is small in the absolute
sense and we require a new theory for dealing with it. 

A consequence of the preceding discussion is that we must revise 
our ideas of causality. Causality applies only to a system which is 
left undisturbed. If a system is small, we cannot observe it without 
producing a serious disturbance and hence we cannot expect to find 
any causal connexion between the results of our observations. 
Causality will still be assumed to apply to undisturbed systems and 
the equations which will be set up to describe an undisturbed system 
will be differential equations expressing a causal connexion between 
conditions at one time and conditions at a later time. These equations 
will be in close correspondence with the equations of classical 
mechanics, but they will be connected only indirectly with the results 
of observations. There is an unavoidable indeterminacy in the
calculation of observational results, the theory enabling us to calculate in
general only the probability of our obtaining a particular result when 
we make an observation. 

2. The polarization of photons 

The discussion in the preceding section about the limit to the 
gentleness with which observations can be made and the consequent 
indeterminacy in the results of those observations does not provide 
any quantitative basis for the building up of quantum mechanics. 
For this purpose a new set of accurate laws of nature is required. 
One of the most fundamental and most drastic of these is the Principle 
of Superposition of States. We shall lead up to a general formulation 
of this principle through a consideration of some special cases, taking 
first the example provided by the polarization of light. 

It is known experimentally that when plane-polarized light is used 
for ejecting photo-electrons, there is a preferential direction for the 
electron emission. Thus the polarization properties of light are closely 
connected with its corpuscular properties and one must ascribe a 
polarization to the photons. One must consider, for instance, a beam 
of light plane-polarized in a certain direction as consisting of photons
each of which is plane-polarized in that direction and a beam of 
circularly polarized light as consisting of photons each circularly 
polarized. Every photon is in a certain state of polarization, as we 
shall say. The problem we must now consider is how to fit in these 
ideas with the known facts about the resolution of light into polarized 
components and the recombination of these components. 

Let us take a definite case. Suppose we have a beam of light passing
through a crystal of tourmaline, which has the property of letting 
through only light plane-polarized perpendicular to its optic axis. 
Classical electrodynamics tells us what will happen for any given 
polarization of the incident beam. If this beam is polarized
perpendicular to the optic axis, it will all go through the crystal; if
parallel to the axis, none of it will go through; while if polarized at
an angle $\alpha$ to the axis, a fraction $\sin^2\alpha$ will go through. How are we
to understand these results on a photon basis?

A beam that is plane-polarized in a certain direction is to be 
pictured as made up of photons each plane-polarized in that 
direction. This picture leads to no difficulty in the cases when our
incident beam is polarized perpendicular or parallel to the optic axis.
We merely have to suppose that each photon polarized perpendicular
to the axis passes unhindered and unchanged through the crystal,
while each photon polarized parallel to the axis is stopped and
absorbed. A difficulty arises, however, in the case of the obliquely
polarized incident beam. Each of the incident photons is then 
obliquely polarized and it is not clear what will happen to such a 
photon when it reaches the tourmaline. 

A question about what will happen to a particular photon under 
certain conditions is not really very precise. To make it precise one 
must imagine some experiment performed having a bearing on the 
question and inquire what will be the result of the experiment. Only
questions about the results of experiments have a real significance 
and it is only such questions that theoretical physics has to consider. 

In our present example the obvious experiment is to use an incident 
beam consisting of only a single photon and to observe what appears 
on the back side of the crystal. According to quantum mechanics 
the result of this experiment will be that sometimes one will find a 
whole photon, of energy equal to the energy of the incident photon, 
on the back side and other times one will find nothing. When one 
finds a whole photon, it will be polarized perpendicular to the optic
axis. One will never find only a part of a photon on the back side.
If one repeats the experiment a large number of times, one will find
the photon on the back side in a fraction $\sin^2\alpha$ of the total number
of times. Thus we may say that the photon has a probability $\sin^2\alpha$
of passing through the tourmaline and appearing on the back side
polarized perpendicular to the axis and a probability $\cos^2\alpha$ of being
absorbed. These values for the probabilities lead to the correct 
classical results for an incident beam containing a large number of 
photons. 
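
The way whole-photon outcomes reproduce the classical fraction can be sketched numerically. The following is an illustrative simulation only: the function names, the angle of 30 degrees and the fixed random seed are assumptions of the sketch, and the pseudo-random draw stands in for the undetermined outcome of each trial.

```python
import math
import random

def photon_passes(alpha, rng):
    """One single-photon trial: True if a whole photon appears behind the crystal."""
    return rng.random() < math.sin(alpha) ** 2

def transmitted_fraction(alpha, n_photons, seed=0):
    """Fraction of n_photons single-photon trials in which the photon gets through."""
    rng = random.Random(seed)
    passed = sum(photon_passes(alpha, rng) for _ in range(n_photons))
    return passed / n_photons

alpha = math.radians(30.0)          # angle between polarization and optic axis
estimate = transmitted_fraction(alpha, 100_000)
print(estimate, math.sin(alpha) ** 2)
```

Each trial yields either a whole photon or nothing, yet for a large number of trials the transmitted fraction settles near the classical value.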

In this way we preserve the individuality of the photon in all 
cases. We are able to do this, however, only because we abandon the 
determinacy of the classical theory. The result of an experiment is 
not determined, as it would be according to classical ideas, by the 
conditions under the control of the experimenter. The most that can 
be predicted is a set of possible results, with a probability of
occurrence for each.

The foregoing discussion about the result of an experiment with a 
single obliquely polarized photon incident on a crystal of tourmaline 
answers all that can legitimately be asked about what happens to an 
obliquely polarized photon when it reaches the tourmaline. Questions 
about what decides whether the photon is to go through or not and 
how it changes its direction of polarization when it does go through 
cannot be investigated by experiment and should be regarded as 
outside the domain of science. Nevertheless some further description 
is necessary in order to correlate the results of this experiment with
the results of other experiments that might be performed with 
photons and to fit them all into a general scheme. Such further 
description should be regarded, not as an attempt to answer questions 
outside the domain of science, but as an aid to the formulation of 
rules for expressing concisely the results of large numbers of
experiments.

The further description provided by quantum mechanics runs as 
follows. It is supposed that a photon polarized obliquely to the optic 
axis may be regarded as being partly in the state of polarization 
parallel to the axis and partly in the state of polarization
perpendicular to the axis. The state of oblique polarization may be
considered as the result of some kind of superposition process applied to
the two states of parallel and perpendicular polarization. This implies
a certain special kind of relationship between the various states of
polarization, a relationship similar to that between polarized beams in 
classical optics, but which is now to be applied, not to beams, but to 
the states of polarization of one particular photon. This relationship 
allows any state of polarization to be resolved into, or expressed as a 
superposition of, any two mutually perpendicular states of
polarization.

When we make the photon meet a tourmaline crystal, we are
subjecting it to an observation. We are observing whether it is polarized
parallel or perpendicular to the optic axis. The effect of making this 
observation is to force the photon entirely into the state of parallel 
or entirely into the state of perpendicular polarization. It has to 
make a sudden jump from being partly in each of these two states to 
being entirely in one or other of them. Which of the two states it will 
jump into cannot be predicted, but is governed only by probability 
laws. If it jumps into the parallel state it gets absorbed and if it 
jumps into the perpendicular state it passes through the crystal and 
appears on the other side preserving this state of polarization. 
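
The resolution of an oblique state into two perpendicular states can be sketched in the same way as for a classical polarization vector. It is an assumption of this sketch, consistent with the probabilities quoted above but not yet derived in the text, that the probability of a jump into either state is the square of the corresponding component. The function names and the parameter basis_angle are hypothetical.

```python
import math

def resolve(alpha, basis_angle=0.0):
    """Components of a unit polarization vector at angle alpha to the optic axis.

    The first component lies along the direction at basis_angle, the second
    along the direction perpendicular to it. With basis_angle = 0 these are
    the parallel and perpendicular states of the text.
    """
    rel = alpha - basis_angle
    return math.cos(rel), math.sin(rel)

def jump_probabilities(alpha, basis_angle=0.0):
    """Assumed rule: probability of each jump is the squared component."""
    first, second = resolve(alpha, basis_angle)
    return first ** 2, second ** 2

p_parallel, p_perpendicular = jump_probabilities(math.radians(30.0))
print(p_parallel, p_perpendicular)   # absorbed, transmitted
```

Any other pair of mutually perpendicular directions may be chosen through basis_angle, and the two probabilities still add up to unity.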

3. Interference of photons 

In this section we shall deal with another example of superposition. 
We shall again take photons, but shall be concerned with their
position in space and their momentum instead of their polarization. If
we are given a beam of roughly monochromatic light, then we know 
something about the location and momentum of the associated 
photons. We know that each of them is located somewhere in the 
region of space through which the beam is passing and has a
momentum in the direction of the beam of magnitude given in terms of the
frequency of the beam by Einstein’s photo-electric law—momentum 
equals frequency multiplied by a universal constant. When we have 
such information about the location and momentum of a photon we 
shall say that it is in a definite translational state.
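
As a small numerical illustration of the law just quoted, the sketch below takes the universal constant to be Planck's constant divided by the velocity of light, which is an identification assumed here rather than stated in the text. The function names and the sample frequency are likewise chosen only for illustration.

```python
PLANCK_H = 6.62607015e-34   # Planck's constant in J s (exact SI value)
LIGHT_C = 299_792_458.0     # velocity of light in m/s (exact SI value)

def photon_momentum(frequency_hz):
    """Momentum of a photon: frequency times the constant h/c (assumed)."""
    return PLANCK_H * frequency_hz / LIGHT_C

def photon_energy(frequency_hz):
    """Energy of a photon: frequency times h."""
    return PLANCK_H * frequency_hz

green = 5.5e14  # an illustrative frequency in the visible range, in Hz
print(photon_momentum(green), photon_energy(green))
```

The momentum obtained for visible light is of the order of 1e-27 kg m/s.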

We shall discuss the description which quantum mechanics
provides of the interference of photons. Let us take a definite
experiment demonstrating interference. Suppose we have a beam of light
which is passed through some kind of interferometer, so that it gets
split up into two components and the two components are
subsequently made to interfere. We may, as in the preceding section, take
an incident beam consisting of only a single photon and inquire what 
will happen to it as it goes through the apparatus. This will present
to us the difficulty of the conflict between the wave and corpuscular 
theories of light in an acute form. 

Corresponding to the description that we had in the case of the 
polarization, we must now describe the photon as going partly into 
each of the two components into which the incident beam is split. 
The photon is then, as we may say, in a translational state given by the 
superposition of the two translational states associated with the two 
components. We are thus led to a generalization of the term
'translational state' applied to a photon. For a photon to be in a definite
translational state it need not be associated with one single beam of
light, but may be associated with two or more beams of light which
are the components into which one original beam has been split.† In
the accurate mathematical theory each translational state is associated 
with one of the wave functions of ordinary wave optics, which wave 
function may describe either a single beam or two or more beams 
into which one original beam has been split. Translational states are 
thus superposable in a similar way to wave functions. 

Let us consider now what happens when we determine the energy 
in one of the components. The result of such a determination must 
be either the whole photon or nothing at all. Thus the photon must 
change suddenly from being partly in one beam and partly in the 
other to being entirely in one of the beams. This sudden change is 
due to the disturbance in the translational state of the photon which 
the observation necessarily makes. It is impossible to predict in which 
of the two beams the photon will be found. Only the probability of 
either result can be calculated from the previous distribution of the 
photon over the two beams. 

One could carry out the energy measurement without destroying the 
component beam by, for example, reflecting the beam from a movable 
mirror and observing the recoil. Our description of the photon allows 
us to infer that, after such an energy measurement, it would not be 
possible to bring about any interference effects between the two
components. So long as the photon is partly in one beam and partly in
the other, interference can occur when the two beams are superposed,
but this possibility disappears when the photon is forced entirely into
one of the beams by an observation. The other beam then no longer
enters into the description of the photon, so that it counts as being
entirely in the one beam in the ordinary way for any experiment that
may subsequently be performed on it.

† The circumstance that the superposition idea requires us to generalize our
original meaning of translational states, but that no corresponding generalization was
needed for the states of polarization of the preceding section, is an accidental one
with no underlying theoretical significance.

On these lines quantum mechanics is able to effect a reconciliation 
of the wave and corpuscular properties of light. The essential point 
is the association of each of the translational states of a photon with 
one of the wave functions of ordinary wave optics. The nature of this 
association cannot be pictured on a basis of classical mechanics, but 
is something entirely new. It would be quite wrong to picture the 
photon and its associated wave as interacting in the way in which
particles and waves can interact in classical mechanics. The
association can be interpreted only statistically, the wave function giving
us information about the probability of our finding the photon in any 
particular place when we make an observation of where it is. 
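
The contrast between a photon that is partly in each beam and one that has been forced into a single beam can be sketched with the two-beam sum of ordinary wave optics. The sketch assumes an equal split, a recombination that sends the sum of the two amplitudes to one detector, and a phase difference between the beams; the function name and its parameters are hypothetical.

```python
import cmath
import math

def detection_probability(phase, coherent=True):
    """Probability of finding the photon at one output after recombination.

    phase    -- phase difference between the two component beams (assumed)
    coherent -- True while the photon is partly in each beam; False after an
                observation has forced it entirely into one of them
    """
    a1 = 1 / math.sqrt(2)                        # amplitude in the first beam
    a2 = cmath.exp(1j * phase) / math.sqrt(2)    # amplitude in the second beam
    if coherent:
        # Amplitudes are added first and squared afterwards.
        return abs((a1 + a2) / math.sqrt(2)) ** 2
    # Probabilities are added; the phase no longer matters.
    return (abs(a1) ** 2 + abs(a2) ** 2) / 2

for phase in (0.0, math.pi / 2, math.pi):
    print(phase, detection_probability(phase), detection_probability(phase, coherent=False))
```

In the coherent case the probability varies between unity and zero with the phase, whereas after the observation it stays at one half, so that no interference effect remains.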

Some time before the discovery of quantum mechanics people 
realized that the connexion between light waves and photons must 
be of a statistical character. What they did not clearly realize,
however, was that the wave function gives information about the
probability of one photon being in a particular place and not the probable
number of photons in that place. The importance of the distinction 
can be made clear in the following way. Suppose we have a beam 
of light consisting of a large number of photons split up into two
components of equal intensity. On the assumption that the intensity of
a beam is connected with the probable number of photons in it, we
should have half the total number of photons going into each
component. If the two components are now made to interfere, we should
require a photon in one component to be able to interfere with one in 
the other. Sometimes these two photons would have to annihilate one 
another and other times they would have to produce four photons. 
This would contradict the conservation of energy. The new theory, 
which connects the wave function with probabilities for one photon, 
gets over the difficulty by making each photon go partly into each of 
the two components. Each photon then interferes only with itself. 
Interference between two different photons never occurs. 

The association of particles with waves discussed above is not 
restricted to the case of light, but is, according to modern theory, 
of universal applicability. All kinds of particles are associated with 
waves in this way and conversely all wave motion is associated with 
particles. Thus all particles can be made to exhibit interference
effects and all wave motion has its energy in the form of quanta. The 
reason why these general phenomena are not more obvious is on 
account of a law of proportionality between the mass or energy of the 
particles and the frequency of the waves, the coefficient being such 
that for waves of familiar frequencies the associated quanta are 
extremely small, while for particles even as light as electrons the 
associated wave frequency is so high that it is not easy to demonstrate 
interference. 

4. Superposition and indeterminacy 

The reader may possibly feel dissatisfied with the attempt in the 
two preceding sections to fit in the existence of photons with the 
classical theory of light. He may argue that a very strange idea has 
been introduced—the possibility of a photon being partly in each of 
two states of polarization, or partly in each of two separate beams— 
but even with the help of this strange idea no satisfying picture of 
the fundamental single-photon processes has been given. He may say 
further that this strange idea did not provide any information about 
experimental results for the experiments discussed, beyond what 
could have been obtained from an elementary consideration of 
photons being guided in some vague way by waves. What, then, is 
the use of the strange idea?

In answer to the first criticism it may be remarked that the main 
object of physical science is not the provision of pictures, but is the 
formulation of laws governing phenomena and the application of 
these laws to the discovery of new phenomena. If a picture exists, 
so much the better; but whether a picture exists or not is a matter 
of only secondary importance. In the case of atomic phenomena 
no picture can be expected to exist in the usual sense of the word 
‘picture’, by which is meant a model functioning essentially on
classical lines. One may, however, extend the meaning of the word
‘picture’ to include any way of looking at the fundamental laws which
makes their self-consistency obvious. With this extension, one may 
gradually acquire a picture of atomic phenomena by becoming 
familiar with the laws of the quantum theory. 

With regard to the second criticism, it may be remarked that for 
many simple experiments with light, an elementary theory of waves 
and photons connected in a vague statistical way would be adequate 
to account for the results. In the case of such experiments quantum
mechanics has no further information to give. In the great majority 
of experiments, however, the conditions are too complex for an 
elementary theory of this kind to be applicable and some more 
elaborate scheme, such as is provided by quantum mechanics, is then 
needed. The method of description that quantum mechanics gives 
in the more complex cases is applicable also to the simple cases and 
although it is then not really necessary for accounting for the experimental
results, its study in these simple cases is perhaps a suitable
introduction to its study in the general case. 

There remains an overall criticism that one may make to the whole 
scheme, namely, that in departing from the determinacy of the 
classical theory a great complication is introduced into the description
of Nature, which is a highly undesirable feature. This complication
is undeniable, but it is offset by a great simplification, provided
by the general principle of superposition of states, which we shall now
go on to consider. But first it is necessary to make precise the important
concept of a ‘state’ of a general atomic system.

Let us take any atomic system, composed of particles or bodies 
with specified properties (mass, moment of inertia, etc.) interacting 
according to specified laws of force. There will be various possible 
motions of the particles or bodies consistent with the laws of force. 
Each such motion is called a state of the system. According to 
classical ideas one could specify a state by giving numerical values 
to all the coordinates and velocities of the various component parts 
of the system at some instant of time, the whole motion being then 
completely determined. Now the argument of pp. 3 and 4 shows that 
we cannot observe a small system with that amount of detail which 
classical theory supposes. The limitation in the power of observation 
puts a limitation on the number of data that can be assigned to a 
state. Thus a state of an atomic system must be specified by fewer 
or more indefinite data than a complete set of numerical values 
for all the coordinates and velocities at some instant of time. In the 
case when the system is just a single photon, a state would be completely
specified by a given state of motion in the sense of § 3
together with a given state of polarization in the sense of § 2. 

A state of a system may be defined as an undisturbed motion that 
is restricted by as many conditions or data as are theoretically 
possible without mutual interference or contradiction. In practice 
the conditions could be imposed by a suitable preparation of the
system, consisting perhaps in passing it through various kinds of 
sorting apparatus, such as slits and polarimeters, the system being 
left undisturbed after the preparation. The word ‘state’ may be
used to mean either the state at one particular time (after the 
preparation), or the state throughout the whole of time after the 
preparation. To distinguish these two meanings, the latter will be 
called a ‘state of motion’ when there is liable to be ambiguity.

The general principle of superposition of quantum mechanics 
applies to the states, with either of the above meanings, of any one 
dynamical system. It requires us to assume that between these 
states there exist peculiar relationships such that whenever the 
system is definitely in one state we can consider it as being partly 
in each of two or more other states. The original state must be 
regarded as the result of a kind of superposition of the two or more 
new states, in a way that cannot be conceived on classical ideas. Any 
state may be considered as the result of a superposition of two or 
more other states, and indeed in an infinite number of ways. Conversely
any two or more states may be superposed to give a new
state. The procedure of expressing a state as the result of superposition
of a number of other states is a mathematical procedure
that is always permissible, independent of any reference to physical
conditions, like the procedure of resolving a wave into Fourier components.
Whether it is useful in any particular case, though, depends
on the special physical conditions of the problem under consideration. 

In the two preceding sections examples were given of the superposition
principle applied to a system consisting of a single photon.
§ 2 dealt with states differing only with regard to the polarization and 
§ 3 with states differing only with regard to the motion of the photon 
as a whole. 

The nature of the relationships which the superposition principle 
requires to exist between the states of any system is of a kind that 
cannot be explained in terms of familiar physical concepts. One 
cannot in the classical sense picture a system being partly in each of 
two states and see the equivalence of this to the system being completely
in some other state. There is an entirely new idea involved,
to which one must get accustomed and in terms of which one must 
proceed to build up an exact mathematical theory, without having 
any detailed classical picture. 



When a state is formed by the superposition of two other states, 
it will have properties that are in some vague way intermediate 
between those of the two original states and that approach more or 
less closely to those of either of them according to the greater or less
‘weight’ attached to this state in the superposition process. The new
state is completely defined by the two original states when their 
relative weights in the superposition process are known, together 
with a certain phase difference, the exact meaning of weights and 
phases being provided in the general case by the mathematical theory. 
In the case of the polarization of a photon their meaning is that provided
by classical optics, so that, for example, when two perpendicularly
plane polarized states are superposed with equal weights, the
new state may be circularly polarized in either direction, or linearly
polarized at an angle π/4, or else elliptically polarized, according to
the phase difference.
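
The classical-optics rule just quoted can be illustrated by a short Python sketch. It is an illustration only: the function names, the tolerance, and the choice to represent the two perpendicular plane-polarized components by a pair of complex amplitudes are assumptions made here and are not part of the text.

```python
import cmath
import math


def equal_weight_state(phase_difference):
    """Amplitudes (c_x, c_y) for two perpendicular plane polarizations
    superposed with equal weights and the given phase difference."""
    return 1.0, cmath.exp(1j * phase_difference)


def classify_polarization(c_x, c_y, tol=1e-9):
    """Classify the polarization described by complex amplitudes along
    two perpendicular directions, following classical optics."""
    if abs(c_x) < tol and abs(c_y) < tol:
        raise ValueError("both amplitudes vanish: no state is described")
    if abs(c_x) < tol or abs(c_y) < tol:
        return "linear"  # only one plane-polarized component is present
    ratio = c_y / c_x  # only the ratio of the two amplitudes matters
    if abs(ratio.imag) < tol:
        return "linear"  # components in phase or exactly out of phase
    if abs(ratio.real) < tol and abs(abs(ratio) - 1.0) < tol:
        return "circular"  # equal weights, phase difference of +pi/2 or -pi/2
    return "elliptical"


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, -math.pi / 2, math.pi / 3):
        print(phase, classify_polarization(*equal_weight_state(phase)))
```

With equal weights, a phase difference of zero gives the linear case, a phase difference of a quarter period in either sense gives the two circular cases, and any other phase difference gives an elliptical state.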

The non-classical nature of the superposition process is brought
out clearly if we consider the superposition of two states, A and B,
such that there exists an observation which, when made on the
system in state A, is certain to lead to one particular result, a say, and
when made on the system in state B is certain to lead to some different
result, b say. What will be the result of the observation when made
on the system in the superposed state? The answer is that the result
will be sometimes a and sometimes b, according to a probability law
depending on the relative weights of A and B in the superposition
process. It will never be different from both a and b. The intermediate
character of the state formed by superposition thus expresses
itself through the probability of a particular result for an observation
being intermediate between the corresponding probabilities for the original
states,† not through the result itself being intermediate between the
corresponding results for the original states.

† The probability of a particular result for the state formed by superposition is not
always intermediate between those for the original states in the general case when
those for the original states are not zero or unity, so there are restrictions on the
‘intermediateness’ of a state formed by superposition.

In this way we see that such a drastic departure from ordinary
ideas as the assumption of superposition relationships between the
states is possible only on account of the recognition of the importance
of the disturbance accompanying an observation and of the consequent
indeterminacy in the result of the observation. When an
observation is made on any atomic system that is in a given state,
in general the result will not be determinate, i.e., if the experiment
is repeated several times under identical conditions several different
results may be obtained. It is a law of nature, though, that if the
experiment is repeated a large number of times, each particular result
will be obtained in a definite fraction of the total number of times, so
that there is a definite probability of its being obtained. This probability
is what the theory sets out to calculate. Only in special cases
when the probability for some result is unity is the result of the
experiment determinate.

The assumption of superposition relationships between the states 
leads to a mathematical theory in which the equations that define 
a state are linear in the unknowns. In consequence of this, people 
have tried to establish analogies with systems in classical mechanics, 
such as vibrating strings or membranes, which are governed by linear 
equations and for which, therefore, a superposition principle holds. 
Such analogies have led to the name ‘Wave Mechanics’ being sometimes
given to quantum mechanics. It is important to remember,
however, that the superposition that occurs in quantum mechanics is
of an essentially different nature from any occurring in the classical
theory, as is shown by the fact that the quantum superposition principle
demands indeterminacy in the results of observations in order
to be capable of a sensible physical interpretation. The analogies are 
thus liable to be misleading. 

5. Mathematical formulation of the principle 

A profound change has taken place during the present century in 
the opinions physicists have held on the mathematical foundations 
of their subject. Previously they supposed that the principles of 
Newtonian mechanics would provide the basis for the description 
of the whole of physical phenomena and that all the theoretical 
physicist had to do was suitably to develop and apply these principles.
With the recognition that there is no logical reason why
Newtonian and other classical principles should be valid outside the
domains in which they have been experimentally verified has come
the realization that departures from these principles are indeed
necessary. Such departures find their expression through the introduction
of new mathematical formalisms, new schemes of axioms
and rules of manipulation, into the methods of theoretical physics. 

Quantum mechanics provides a good example of the new ideas. It 
requires the states of a dynamical system and the dynamical variables
to be interconnected in quite strange ways that are unintelligible 
from the classical standpoint. The states and dynamical variables 
have to be represented by mathematical quantities of different 
natures from those ordinarily used in physics. The new scheme 
becomes a precise physical theory when all the axioms and rules of 
manipulation governing the mathematical quantities are specified 
and when in addition certain laws are laid down connecting physical 
facts with the mathematical formalism, so that from any given 
physical conditions equations between the mathematical quantities 
may be inferred and vice versa. In an application of the theory one 
would be given certain physical information, which one would proceed
to express by equations between the mathematical quantities.
One would then deduce new equations with the help of the axioms 
and rules of manipulation and would conclude by interpreting these 
new equations as physical conditions. The justification for the whole 
scheme depends, apart from internal consistency, on the agreement 
of the final results with experiment. 

We shall begin to set up the scheme by dealing with the mathematical
relations between the states of a dynamical system at one
instant of time, which relations will come from the mathematical
formulation of the principle of superposition. The superposition process
is a kind of additive process and implies that states can in some
way be added to give new states. The states must therefore be connected
with mathematical quantities of a kind which can be added
together to give other quantities of the same kind. The most obvious 
of such quantities are vectors. Ordinary vectors, existing in a space 
of a finite number of dimensions, are not sufficiently general for 
most of the dynamical systems in quantum mechanics. We have to 
make a generalization to vectors in a space of an infinite number of 
dimensions, and the mathematical treatment becomes complicated 
by questions of convergence. For the present, however, we shall deal 
merely with some general properties of the vectors, properties which 
can be deduced on the basis of a simple scheme of axioms, and 
questions of convergence and related topics will not be gone into 
until the need arises. 

It is desirable to have a special name for describing the vectors 
which are connected with the states of a system in quantum mechanics,
whether they are in a space of a finite or an infinite number of
dimensions. We shall call them ket vectors, or simply kets, and denote
a general one of them by a special symbol | >. If we want to specify
a particular one of them by a label, A say, we insert it in the middle,
thus |A>. The suitability of this notation will become clear as the
scheme is developed.

Ket vectors may be multiplied by complex numbers and may be
added together to give other ket vectors, e.g. from two ket vectors
|A> and |B> we can form

$$ c_1 |A\rangle + c_2 |B\rangle = |R\rangle, \tag{1} $$

say, where c_1 and c_2 are any two complex numbers. We may also
perform more general linear processes with them, such as adding an
infinite sequence of them, and if we have a ket vector |x>, depending
on and labelled by a parameter x which can take on all values in a
certain range, we may integrate it with respect to x, to get another
ket vector

$$ \int |x\rangle \, dx = |Q\rangle $$

say. A ket vector which is expressible linearly in terms of certain
others is said to be dependent on them. A set of ket vectors are called
independent if no one of them is expressible linearly in terms of the
others.

We now assume that each state of a dynamical system at a particular
time corresponds to a ket vector, the correspondence being such that if a
state results from the superposition of certain other states, its corresponding
ket vector is expressible linearly in terms of the corresponding ket
vectors of the other states, and conversely. Thus the state R results from
a superposition of the states A and B when the corresponding ket
vectors are connected by (1).

The above assumption leads to certain properties of the superposition
process, properties which are in fact necessary for the word
‘superposition’ to be appropriate. When two or more states are
superposed, the order in which they occur in the superposition
process is unimportant, so the superposition process is symmetrical
between the states that are superposed. Again, we see from equation
(1) that (excluding the case when the coefficient c_1 or c_2 is zero) if
the state R can be formed by superposition of the states A and B,
then the state A can be formed by superposition of B and R, and B
can be formed by superposition of A and R. The superposition
relationship is symmetrical between all three states A, B, and R.

A state which results from the superposition of certain other
states will be said to be dependent on those states. More generally, 
a state will be said to be dependent on any set of states, finite or 
infinite in number, if its corresponding ket vector is dependent on 
the corresponding ket vectors of the set of states. A set of states 
will be called independent if no one of them is dependent on the 
others. 

To proceed with the mathematical formulation of the superposition
principle we must introduce a further assumption, namely the assumption
that by superposing a state with itself we cannot form any new
state, but only the original state over again. If the original state
corresponds to the ket vector |A>, when it is superposed with itself
the resulting state will correspond to

$$ c_1 |A\rangle + c_2 |A\rangle = (c_1 + c_2) |A\rangle, $$

where c_1 and c_2 are numbers. Now we may have c_1 + c_2 = 0, in which
case the result of the superposition process would be nothing at all,
the two components having cancelled each other by an interference
effect. Our new assumption requires that, apart from this special
case, the resulting state must be the same as the original one, so that
(c_1 + c_2)|A> must correspond to the same state that |A> does. Now
c_1 + c_2 is an arbitrary complex number and hence we can conclude
that if the ket vector corresponding to a state is multiplied by any
complex number, not zero, the resulting ket vector will correspond to the
same state. Thus a state is specified by the direction of a ket vector
and any length one may assign to the ket vector is irrelevant. All
the states of the dynamical system are in one-one correspondence
with all the possible directions for a ket vector, no distinction being
made between the directions of the ket vectors |A> and -|A>.
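
The statement that only the direction of a ket vector matters can be checked numerically in a small case. The following Python sketch assumes, purely for illustration, that a ket is represented by a finite list of complex components; the function names and the tolerance are hypothetical and are not part of the text.

```python
def is_zero(ket, tol=1e-9):
    """True when every component of the ket vanishes."""
    return all(abs(x) < tol for x in ket)


def same_state(ket_a, ket_b, tol=1e-9):
    """True when ket_b = c * ket_a for some non-zero complex number c,
    i.e. when the two kets have the same direction."""
    if is_zero(ket_a, tol) or is_zero(ket_b, tol):
        raise ValueError("the zero ket corresponds to no state at all")
    if len(ket_a) != len(ket_b):
        return False
    n = len(ket_a)
    # Two vectors are proportional exactly when every 2x2 minor
    # a_i * b_j - a_j * b_i vanishes.
    return all(
        abs(ket_a[i] * ket_b[j] - ket_a[j] * ket_b[i]) < tol
        for i in range(n)
        for j in range(i + 1, n)
    )


def superpose_with_itself(c1, c2, ket):
    """Form c1|A> + c2|A>, which equals (c1 + c2)|A>."""
    return [c1 * x + c2 * x for x in ket]


if __name__ == "__main__":
    A = [1 + 0j, 1j]
    print(same_state(A, superpose_with_itself(0.5, 2j, A)))
    print(same_state(A, [-x for x in A]))
    print(same_state(A, [1 + 0j, -1j]))
```

Superposing |A> with itself, or replacing |A> by -|A>, leaves `same_state` true, while a ket in a genuinely different direction does not. The special case c_1 + c_2 = 0 produces the zero ket, for which the sketch raises an error, in keeping with the remark that the result is then nothing at all.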

The assumption just made shows up very clearly the fundamental 
difference between the superposition of the quantum theory and any 
kind of classical superposition. In the case of a classical system for 
which a superposition principle holds, for instance a vibrating membrane,
when one superposes a state with itself the result is a different
state, with a different magnitude of the oscillations. There is no 
physical characteristic of a quantum state corresponding to the 
magnitude of the classical oscillations, as distinct from their quality, 
described by the ratios of the amplitudes at different points of 
the membrane. Again, while there exists a classical state with zero 
amplitude of oscillation everywhere, namely the state of rest, there
does not exist any corresponding state for a quantum system, the 
zero ket vector corresponding to no state at all. 

Given two states corresponding to the ket vectors |A> and |B>,
the general state formed by superposing them corresponds to a ket
vector |R> which is determined by two complex numbers, namely
the coefficients c_1 and c_2 of equation (1). If these two coefficients are
multiplied by the same factor (itself a complex number), the ket
vector |R> will get multiplied by this factor and the corresponding
state will be unaltered. Thus only the ratio of the two coefficients
is effective in determining the state R. Hence this state is determined
by one complex number, or by two real parameters. Thus
from two given states, a twofold infinity of states may be obtained
by superposition.

This result is confirmed by the examples discussed in §§ 2 and 3. 
In the example of § 2 there are just two independent states of polarization
for a photon, which may be taken to be the states of plane
polarization parallel and perpendicular to some fixed direction, and
from the superposition of these two a twofold infinity of states of
polarization can be obtained, namely all the states of elliptic polarization,
the general one of which requires two parameters to describe
it. Again, in the example of § 3, from the superposition of two given 
states of motion for a photon a twofold infinity of states of motion 
may be obtained, the general one of which is described by two 
parameters, which may be taken to be the ratio of the amplitudes 
of the two wave functions that are added together and their phase 
relationship. This confirmation shows the need for allowing complex 
coefficients in equation (1). If these coefficients were restricted to be 
real, then, since only their ratio is of importance for determining the 
direction of the resultant ket vector |R> when |A> and |B> are
given, there would be only a simple infinity of states obtainable from 
the superposition. 
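
The counting of parameters can be made concrete with a short Python sketch. It is offered only as an illustration: the function name `state_parameters` and the choice of modulus and phase as the two real parameters are assumptions made here, and the case c_1 = 0 is simply excluded.

```python
import cmath


def state_parameters(c1, c2):
    """Return the two real parameters (modulus and phase of c2/c1) that
    fix the superposed state c1|A> + c2|B> once |A> and |B> are given."""
    if c1 == 0:
        raise ValueError("c1 = 0 is excluded in this sketch")
    ratio = c2 / c1  # a common factor in c1 and c2 cancels here
    return abs(ratio), cmath.phase(ratio)


if __name__ == "__main__":
    common_factor = 0.5 - 3j
    print(state_parameters(1 + 1j, 2 - 1j))
    print(state_parameters(common_factor * (1 + 1j), common_factor * (2 - 1j)))
    # With real coefficients the phase can only be 0 or pi, so only the
    # modulus varies continuously: a simple infinity of states.
    print(state_parameters(2.0, 3.0), state_parameters(2.0, -3.0))
```

Multiplying both coefficients by the same complex factor leaves the two parameters unchanged up to rounding, which is the statement that only the ratio of the coefficients is effective.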

6. Bra and ket vectors 

Whenever we have a set of vectors in any mathematical theory, 
we can always set up a second set of vectors, which mathematicians 
call the dual vectors. The procedure will be described for the case 
when the original vectors are our ket vectors. 

Suppose we have a number φ which is a function of a ket vector
|A>, i.e. to each ket vector |A> there corresponds one number φ,
and suppose further that the function is a linear one, which means
that the number corresponding to |A> + |A'> is the sum of the
numbers corresponding to |A> and to |A'>, and the number corresponding
to c|A> is c times the number corresponding to |A>, c
being any numerical factor. Then the number φ corresponding to
any |A> may be looked upon as the scalar product of that |A> with
some new vector, there being one of these new vectors for each linear
function of the ket vectors |A>. The justification for this way of
looking at φ is that, as will be seen later (see equations (5) and (6)),
the new vectors may be added together and may be multiplied by
numbers to give other vectors of the same kind. The new vectors
are, of course, defined only to the extent that their scalar products
with the original ket vectors are given numbers, but this is sufficient
for one to be able to build up a mathematical theory about
them.

We shall call the new vectors bra vectors, or simply bras, and denote
a general one of them by the symbol < |, the mirror image of the
symbol for a ket vector. If we want to specify a particular one of
them by a label, B say, we write it in the middle, thus <B|. The
scalar product of a bra vector <B| and a ket vector |A> will be
written <B|A>, i.e. as a juxtaposition of the symbols for the bra
and ket vectors, that for the bra vector being on the left, and the
two vertical lines being contracted to one for brevity.

One may look upon the symbols < and > as a distinctive kind of
brackets. A scalar product <B|A> now appears as a complete bracket
expression and a bra vector <B| or a ket vector |A> as an incomplete
bracket expression. We have the rules that any complete bracket
expression denotes a number and any incomplete bracket expression
denotes a vector, of the bra or ket kind according to whether it contains
the first or second part of the brackets.

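The condition that the scalar product of <B| and |A> is a linear
function of |A> may be expressed symbolically by

$$ \langle B| \{ |A\rangle + |A'\rangle \} = \langle B|A\rangle + \langle B|A'\rangle, \tag{2} $$

$$ \langle B| \{ c|A\rangle \} = c \langle B|A\rangle, \tag{3} $$

c being any number.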

A bra vector is considered to be completely defined when its scalar
product with every ket vector is given, so that if a bra vector has its
scalar product with every ket vector vanishing, the bra vector itself
must be considered as vanishing. In symbols, if

$$ \langle P|A\rangle = 0 \ \text{for all } |A\rangle, \quad \text{then} \quad \langle P| = 0. \tag{4} $$



The sum of two bra vectors <B| and <B'| is defined by the condition
that its scalar product with any ket vector |A> is the sum of the
scalar products of <B| and <B'| with |A>,

$$ \{ \langle B| + \langle B'| \} |A\rangle = \langle B|A\rangle + \langle B'|A\rangle, \tag{5} $$

and the product of a bra vector <B| and a number c is defined by the
condition that its scalar product with any ket vector |A> is c times
the scalar product of <B| with |A>,

$$ \{ c \langle B| \} |A\rangle = c \langle B|A\rangle. \tag{6} $$


Equations (2) and (5) show that products of bra and ket vectors 
satisfy the distributive axiom of multiplication, and equations (3) 
and (6) show that multiplication by numerical factors satisfies the 
usual algebraic axioms. 

The bra vectors, as they have been here introduced, are quite a 
different kind of vector from the kets, and so far there is no connexion 
between them except for the existence of a scalar product of a bra 
and a ket. We now make the assumption that there is a one-one
correspondence between the bras and the kets, such that the bra corresponding
to |A> + |A'> is the sum of the bras corresponding to |A> and
to |A'>, and the bra corresponding to c|A> is c̄ times the bra corresponding
to |A>, c̄ being the conjugate complex number to c. We shall
use the same label to specify a ket and the corresponding bra. Thus
the bra corresponding to |A> will be written <A|.

The relationship between a ket vector and the corresponding bra 
makes it reasonable to call one of them the conjugate imaginary of 
the other. Our bra and ket vectors are complex quantities, since they 
can be multiplied by complex numbers and are then of the same 
nature as before, but they are complex quantities of a special kind 
which cannot be split up into real and pure imaginary parts. The 
usual method of getting the real part of a complex quantity, by 
taking half the sum of the quantity itself and its conjugate, cannot 
be applied since a bra and a ket vector are of different natures and 
cannot be added together. To call attention to this distinction, we 
shall use the words ‘conjugate complex’ to refer to numbers and 
other complex quantities which can be split up into real and pure
imaginary parts, and the words ‘conjugate imaginary’ for bra and 
ket vectors, which cannot. With the former kind of quantity, we 
shall use the notation of putting a bar over one of them to get the 
conjugate complex one. 

On account of the one-one correspondence between bra vectors and 
ket vectors, any state of our dynamical system at a particular time may
be specified by the direction of a bra vector just as well as by the direction 
of a ket vector. In fact the whole theory will be symmetrical in its 
essentials between bras and kets. 


Given any two ket vectors |A> and |B>, we can construct from
them a number <B|A> by taking the scalar product of the first with
the conjugate imaginary of the second. This number depends linearly
on |A> and antilinearly on |B>, the antilinear dependence meaning
that the number formed from |B> + |B'> is the sum of the numbers
formed from |B> and from |B'>, and the number formed from c|B>
is c̄ times the number formed from |B>. There is a second way in
which we can construct a number which depends linearly on |A> and
antilinearly on |B>, namely by forming the scalar product of |B>
with the conjugate imaginary of |A> and taking the conjugate complex
of this scalar product. We assume that these two numbers are
always equal, i.e.

$$ \langle B|A\rangle = \overline{\langle A|B\rangle}. \tag{7} $$

Putting |B> = |A> here, we find that the number <A|A> must be
real. We make the further assumption

$$ \langle A|A\rangle > 0, \tag{8} $$

except when |A> = 0.
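
Assumptions (7) and (8), together with the linear and antilinear dependence just described, can be checked in a simple model. The Python sketch below assumes that a ket is a finite list of complex components and that the conjugate imaginary bra is obtained by conjugating each component; this component picture is an assumption made here for illustration only, and the function names are hypothetical.

```python
def bra(ket):
    """Conjugate imaginary of a ket: conjugate every component."""
    return [complex(x).conjugate() for x in ket]


def bracket(bra_b, ket_a):
    """Scalar product <B|A> of a bra with a ket."""
    return sum(b * a for b, a in zip(bra_b, ket_a))


def add(u, v):
    """Sum of two vectors of the same kind."""
    return [x + y for x, y in zip(u, v)]


def scale(c, u):
    """Product of a vector with a number c."""
    return [c * x for x in u]


if __name__ == "__main__":
    A = [1 + 2j, 3j]
    B = [2 - 1j, 1 + 1j]
    c = 2 + 1j
    # Equation (7): <B|A> is the conjugate complex of <A|B>.
    print(bracket(bra(B), A), bracket(bra(A), B).conjugate())
    # Equation (8): <A|A> is real and positive for a non-zero ket.
    print(bracket(bra(A), A))
    # Antilinear dependence on |B>: c|B> gives the conjugate of c times <B|A>.
    print(bracket(bra(scale(c, B)), A), c.conjugate() * bracket(bra(B), A))
```

In this model the two printed numbers on each comparison line agree, and <A|A> comes out as a positive real number equal to the sum of the squared moduli of the components.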

In ordinary space, from any two vectors one can construct a 
number (their scalar product) which is a real number and is symmetrical
between them. In the space of bra vectors or the space of
ket vectors, from any two vectors one can again construct a number
(the scalar product of one with the conjugate imaginary of the
other), but this number is complex and goes over into the conjugate
complex number when the two vectors are interchanged. There is
thus a kind of perpendicularity in these spaces, which is a generalization
of the perpendicularity in ordinary space. We shall call a bra
and a ket vector orthogonal if their scalar product is zero, and two 
bras or two kets will be called orthogonal if the scalar product of one 
with the conjugate imaginary of the other is zero. Further, we shall 
say that two states of our dynamical system are orthogonal if the
vectors corresponding to these states are orthogonal. 

The length of a bra vector <A| or of the conjugate imaginary ket
vector |A> is defined as the square root of the positive number
<A|A>. When we are given a state and wish to set up a bra or ket
vector to correspond to it, only the direction of the vector is given
and the vector itself is undetermined to the extent of an arbitrary
numerical factor. It is often convenient to choose this numerical
factor so that the vector is of length unity. This procedure is called
normalization and the vector so chosen is said to be normalized. The
vector is not completely determined even then, since one can still
multiply it by any number of modulus unity, i.e. any number e^{iγ}
where γ is real, without changing its length. We shall call such a
number a phase factor.
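
Normalization and the remaining freedom of a phase factor can be illustrated as follows. The Python sketch again assumes, for illustration only, a ket given by a finite list of complex components, with <A|A> equal to the sum of the squared moduli of the components; the function names are hypothetical.

```python
import cmath
import math


def length(ket):
    """Square root of <A|A> in the component picture assumed here."""
    return math.sqrt(sum(abs(x) ** 2 for x in ket))


def normalize(ket):
    """Rescale a non-zero ket so that its length is unity."""
    n = length(ket)
    if n == 0:
        raise ValueError("the zero ket cannot be normalized")
    return [x / n for x in ket]


def apply_phase(ket, gamma):
    """Multiply a ket by the phase factor exp(i*gamma), with gamma real."""
    factor = cmath.exp(1j * gamma)
    return [factor * x for x in ket]


if __name__ == "__main__":
    A = normalize([3 + 0j, 4j])
    print(length(A))
    print(length(apply_phase(A, 0.7)))
```

Both printed lengths equal unity up to rounding, so the normalized ket is fixed only up to a phase factor.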

The foregoing assumptions give the complete scheme of relations 
between the states of a dynamical system at a particular time. The 
relations appear in mathematical form, but they imply physical 
conditions, which will lead to results expressible in terms of observations
when the theory is developed further. For instance, if two states
are orthogonal, it means at present simply a certain equation in our 
formalism, but this equation implies a definite physical relationship 
between the states, which further developments of the theory will 
enable us to interpret in terms of observational results (see the 
bottom of p. 35). 

7. Linear operators

In the preceding section we considered a number which is a linear
function of a ket vector, and this led to the concept of a bra vector. 
We shall now consider a ket vector which is a linear function of a 
ket vector, and this will lead to the concept of a linear operator. 

Suppose we have a ket |F> which is a function of a ket |A>, i.e.
to each ket |A> there corresponds one ket |F>, and suppose further
that the function is a linear one, which means that the |F> corresponding
to |A> + |A'> is the sum of the |F>'s corresponding to |A>
and to |A'>, and the |F> corresponding to c|A> is c times the |F>
corresponding to |A>, c being any numerical factor. Under these
conditions, we may look upon the passage from |A> to |F> as the
application of a linear operator to |A>. Introducing the symbol α
for the linear operator, we may write

$$ |F\rangle = \alpha |A\rangle, $$

in which the result of α operating on |A> is written like a product
of α with |A>. We make the rule that in such products the ket vector
must always be put on the right of the linear operator. The above
conditions of linearity may now be expressed by the equations

$$ \begin{aligned} \alpha \{ |A\rangle + |A'\rangle \} &= \alpha|A\rangle + \alpha|A'\rangle, \\ \alpha \{ c|A\rangle \} &= c\,\alpha|A\rangle. \end{aligned} \tag{1} $$

A linear operator is considered to be completely defined when the 
result of its application to every ket vector is given. Thus a linear 
operator is to be considered zero if the result of its application to every 
ket vanishes, and two linear operators are to be considered equal if 
they produce the same result when applied to every ket. 
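
The two linearity conditions (1) can be checked numerically in a finite-dimensional model. The sketch below assumes, purely for illustration, that a linear operator is represented by a square matrix (a list of rows) and a ket by a list of components; the names `apply`, `add` and `scale` are hypothetical helpers.

```python
def apply(op, ket):
    """Apply a linear operator (list of rows) to a ket (list of components)."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

def add(ket1, ket2):
    """Sum of two kets."""
    return [x + y for x, y in zip(ket1, ket2)]

def scale(c, ket):
    """The ket c|A>."""
    return [c * x for x in ket]

alpha = [[1, 2j], [0, 3]]
ket_a = [1 + 1j, 2]
ket_a2 = [0, 1j]
c = 2 - 1j

# First condition: alpha{|A> + |A'>} = alpha|A> + alpha|A'>
lhs_sum = apply(alpha, add(ket_a, ket_a2))
rhs_sum = add(apply(alpha, ket_a), apply(alpha, ket_a2))

# Second condition: alpha{c|A>} = c alpha|A>
lhs_scale = apply(alpha, scale(c, ket_a))
rhs_scale = scale(c, apply(alpha, ket_a))
```

In this model two operators are equal exactly when their matrices agree entry by entry, since the entries are fixed by the results on the basis kets.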

Linear operators can be added together, the sum of two linear 
operators being defined to be that linear operator which, operating 
on any ket, produces the sum of what the two linear operators 
separately would produce. Thus α+β is defined by

$$\{\alpha+\beta\}|A\rangle = \alpha|A\rangle + \beta|A\rangle \tag{2}$$

for any |A>. Equation (2) and the first of equations (1) show that
products of linear operators with ket vectors satisfy the distributive 
axiom of multiplication. 
Linear operators can also be multiplied together, the product of
two linear operators being defined as that linear operator, the
application of which to any ket produces the same result as the application
of the two linear operators successively. Thus the product αβ is
defined as the linear operator which, operating on any ket |A>,
changes it into that ket which one would get by operating first on
|A> with β, and then on the result of the first operation with α. In
symbols

$$\{\alpha\beta\}|A\rangle = \alpha\{\beta|A\rangle\}.$$

This definition appears as the associative axiom of multiplication for
the triple product of α, β, and |A>, and allows us to write this triple
product as αβ|A> without brackets. However, this triple product is
in general not the same as what we should get if we operated on |A>
first with α and then with β, i.e. in general αβ|A> differs from βα|A>,
so that in general αβ must differ from βα. The commutative axiom of
multiplication does not hold for linear operators. It may happen as a
special case that two linear operators ξ and η are such that ξη and
ηξ are equal. In this case we say that ξ commutes with η, or that ξ
and η commute.

By repeated applications of the above processes of adding and 
multiplying linear operators, one can form sums and products of 
more than two of them, and one can proceed to build up an algebra 
with them. In this algebra the commutative axiom of multiplication 
does not hold, and also the product of two linear operators may 
vanish without either factor vanishing. But all the other axioms of 
ordinary algebra, including the associative and distributive axioms 
of multiplication, are valid, as may easily be verified. 
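
Both departures from ordinary algebra can be seen with 2×2 matrices. The sketch below assumes a matrix representation of the operators (an illustration, not something the text relies on); `matmul` is a hypothetical helper implementing the product defined above, in which the right-hand factor acts first.

```python
def matmul(x, y):
    """Product xy of two square matrices: y acts first, then x."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

alpha = [[0, 1], [0, 0]]
beta = [[0, 0], [1, 0]]
zero = [[0, 0], [0, 0]]

alpha_beta = matmul(alpha, beta)     # [[1, 0], [0, 0]]
beta_alpha = matmul(beta, alpha)     # [[0, 0], [0, 1]]
alpha_alpha = matmul(alpha, alpha)   # [[0, 0], [0, 0]]
```

Here `alpha_beta` differs from `beta_alpha`, so the two operators do not commute, and `alpha_alpha` vanishes although `alpha` itself does not.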

If we take a number k and multiply it into ket vectors, it appears
as a linear operator operating on ket vectors, the conditions (1) being
fulfilled with k substituted for α. A number is thus a special case of
a linear operator. It has the property that it commutes with all linear
operators and this property distinguishes it from a general linear
operator.

So far we have considered linear operators operating only on ket 
vectors. We can give a meaning to their operating also on bra vectors, 
in the following way. Take the scalar product of any bra <B| with
the ket α|A>. This scalar product is a number which depends
linearly on |A> and therefore, from the definition of bras, it may be
considered as the scalar product of |A> with some bra. The bra thus
defined depends linearly on <B|, so we may look upon it as the result of
some linear operator applied to <B|. This linear operator is uniquely
determined by the original linear operator α and may reasonably be
called the same linear operator operating on a bra. In this way our
linear operators are made capable of operating on bra vectors.

A suitable notation to use for the resulting bra when α operates on
the bra <B| is <B|α, as in this notation the equation which defines
<B|α is

$$\{\langle B|\alpha\}|A\rangle = \langle B|\{\alpha|A\rangle\} \tag{3}$$

for any |A>, which simply expresses the associative axiom of
multiplication for the triple product of <B|, α, and |A>. We therefore
make the general rule that in a product of a bra and a linear operator,
the bra must always be put on the left. We can now write the triple
product of <B|, α, and |A> simply as <B|α|A> without brackets. It
may easily be verified that the distributive axiom of multiplication 
holds for products of bras and linear operators just as well as for 
products of linear operators and kets. 

There is one further kind of product which has a meaning in our 
scheme, namely the product of a ket vector and a bra vector with 
the ket on the left, such as |A><B|. To examine this product, let us
multiply it into an arbitrary ket |P>, putting the ket on the right,
and assume the associative axiom of multiplication. The product is
then |A><B|P>, which is another ket, namely |A> multiplied by the
number <B|P>, and this ket depends linearly on the ket |P>. Thus
|A><B| appears as a linear operator that can operate on kets. It
can also operate on bras, its product with a bra <Q| on the left being
<Q|A><B|, which is the number <Q|A> times the bra <B|. The
product |A><B| is to be sharply distinguished from the product
<B|A> of the same factors in the reverse order, the latter product
being, of course, a number. 
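
The distinction between |A><B| and <B|A> can be made concrete in a component model. In the sketch below, kets are assumed to be lists of complex components and |A><B| is assumed to be represented by the matrix with entries a_i times the conjugate of b_j; `inner`, `outer` and `apply` are hypothetical helper names.

```python
def inner(a, b):
    """Scalar product <a|b> of two kets given as component lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def outer(a, b):
    """Matrix of the linear operator |a><b|."""
    return [[x * y.conjugate() for y in b] for x in a]

def apply(op, ket):
    """Apply a matrix to a ket."""
    return [sum(r * k for r, k in zip(row, ket)) for row in op]

ket_a = [1 + 0j, 2j]
ket_b = [1j, 1 + 0j]
ket_p = [2 + 0j, 1 - 1j]

# |A><B| acting on |P> as an operator ...
via_operator = apply(outer(ket_a, ket_b), ket_p)
# ... agrees with the ket |A> multiplied by the number <B|P>.
via_number = [inner(ket_b, ket_p) * x for x in ket_a]
```

`outer(ket_a, ket_b)` is a 2×2 array whereas `inner(ket_b, ket_a)` is a single complex number, mirroring the operator and the number respectively.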

We now have a complete algebraic scheme involving three kinds 
of quantities, bra vectors, ket vectors, and linear operators. They can 
be multiplied together in the various ways discussed above, and the 
associative and distributive axioms of multiplication always hold, 
but the commutative axiom of multiplication does not hold. In this 
general scheme we still have the rules of notation of the preceding 
section, that any complete bracket expression, containing < on the 
left and > on the right, denotes a number, while any incomplete 
bracket expression, containing only < or >, denotes a vector. 
With regard to the physical significance of the scheme, we have
already assumed that the bra vectors and ket vectors, or rather the
directions of these vectors, correspond to the states of a dynamical
system at a particular time. We now make the further assumption
that the linear operators correspond to the dynamical variables at that
time. By dynamical variables are meant quantities such as the
coordinates and the components of velocity, momentum and angular
momentum of particles, and functions of these quantities, in fact
the variables in terms of which classical mechanics is built up. The
new assumption requires that these quantities shall occur also in
quantum mechanics, but with the striking difference that they are
now subject to an algebra in which the commutative axiom of
multiplication does not hold.

This different algebra for the dynamical variables is one of the 
most important ways in which quantum mechanics differs from 
classical mechanics. We shall see later on that, in spite of this
fundamental difference, the dynamical variables of quantum mechanics
still have many properties in common with their classical
counterparts and it will be possible to build up a theory of them closely
analogous to the classical theory and forming a beautiful
generalization of it.

It is convenient to use the same letter to denote a dynamical 
variable and the corresponding linear operator. In fact, we may
consider a dynamical variable and the corresponding linear operator to
be both the same thing, without getting into confusion. 

8. Conjugate relations 

Our linear operators are complex quantities, since one can multiply 
them by complex numbers and get other quantities of the same nature. 
Hence they must correspond in general to complex dynamical
variables, i.e. to complex functions of the coordinates, velocities, etc. We
need some further development of the theory to see what kind of 
linear operator corresponds to a real dynamical variable. 

Consider the ket which is the conjugate imaginary of <P|α. This
ket depends antilinearly on <P| and thus depends linearly on |P>.
It may therefore be considered as the result of some linear operator
operating on |P>. This linear operator is called the adjoint of α and
we shall denote it by ᾱ. With this notation, the conjugate imaginary
of <P|α is ᾱ|P>.

In formula (7) of Chapter I put <P|α for <A| and its conjugate
imaginary ᾱ|P> for |A>. The result is

$$\langle B|\bar{\alpha}|P\rangle = \overline{\langle P|\alpha|B\rangle}. \tag{4}$$

This is a general formula holding for any ket vectors |B>, |P> and
any linear operator α, and it expresses one of the most frequently
used properties of the adjoint.

Putting ᾱ for α in (4), we get

$$\langle B|\bar{\bar{\alpha}}|P\rangle = \overline{\langle P|\bar{\alpha}|B\rangle} = \langle B|\alpha|P\rangle,$$

by using (4) again with |P> and |B> interchanged. This holds for
any ket |P>, so we can infer from (4) of Chapter I,

$$\langle B|\bar{\bar{\alpha}} = \langle B|\alpha,$$

and since this holds for any bra vector <B|, we can infer

$$\bar{\bar{\alpha}} = \alpha.$$

Thus the adjoint of the adjoint of a linear operator is the original linear 
operator. This property of the adjoint makes it like the conjugate 
complex of a number, and it is easily verified that in the special case 
when the linear operator is a number, the adjoint linear operator is 
the conjugate complex number. Thus it is reasonable to assume that 
the adjoint of a linear operator corresponds to the conjugate complex of 
a dynamical variable. With this physical significance for the adjoint 
of a linear operator, we may call the adjoint alternatively the
conjugate complex linear operator, which conforms with our notation ᾱ.
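
Formula (4) and the double-adjoint property can be tested numerically. The sketch below assumes the familiar finite-dimensional representation in which the adjoint of a matrix is its conjugate transpose; that identification, and the helper names, are assumptions of this example rather than statements from the text.

```python
def inner(a, b):
    """Scalar product <a|b> of two kets given as component lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def apply(op, ket):
    """Apply a matrix to a ket."""
    return [sum(r * k for r, k in zip(row, ket)) for row in op]

def adjoint(op):
    """Conjugate transpose, taken here as the matrix of the adjoint."""
    n = len(op)
    return [[op[j][i].conjugate() for j in range(n)] for i in range(n)]

alpha = [[1 + 2j, 3j], [2 + 0j, 4 - 1j]]
ket_b = [1 + 1j, 2 + 0j]
ket_p = [0 + 1j, 1 - 1j]

lhs = inner(ket_b, apply(adjoint(alpha), ket_p))      # <B| adjoint(alpha) |P>
rhs = inner(ket_p, apply(alpha, ket_b)).conjugate()   # conjugate of <P|alpha|B>
twice = adjoint(adjoint(alpha))                       # adjoint of the adjoint
```

The two numbers `lhs` and `rhs` agree, and `twice` reproduces `alpha`, in line with (4) and with the result that the adjoint of the adjoint is the original operator.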

A linear operator may equal its adjoint, and is then called self- 
adjoint. It corresponds to a real dynamical variable, so it may be 
called alternatively a real linear operator. Any linear operator may
be split up into a real part and a pure imaginary part. For this
reason the words ‘conjugate complex’ are applicable to linear
operators and not the words ‘conjugate imaginary’.

The conjugate complex of the sum of two linear operators is 
obviously the sum of their conjugate complexes. To get the conjugate 
complex of the product of two linear operators α and β, we apply
formula (7) of Chapter I with

$$\langle A| = \langle P|\alpha, \qquad \langle B| = \langle Q|\bar{\beta},$$

so that

$$|A\rangle = \bar{\alpha}|P\rangle, \qquad |B\rangle = \beta|Q\rangle.$$

The result is

$$\langle Q|\bar{\beta}\bar{\alpha}|P\rangle = \overline{\langle P|\alpha\beta|Q\rangle} = \langle Q|\overline{\alpha\beta}|P\rangle$$

from (4). Since this holds for any |P> and <Q|, we can infer that

$$\bar{\beta}\bar{\alpha} = \overline{\alpha\beta}. \tag{5}$$

Thus the conjugate complex of the product of two linear operators equals
the product of the conjugate complexes of the factors in the reverse order.

As simple examples of this result, it should be noted that, if ξ and
η are real, in general ξη is not real. This is an important difference
from classical mechanics. However, ξη+ηξ is real, and so is i(ξη−ηξ).
Only when ξ and η commute is ξη itself also real. Further, if ξ is real,
then so is ξ^2 and, more generally, ξ^n with n any positive integer.
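
These statements about products of real operators can be verified on a pair of 2×2 matrices. The sketch assumes that a real (self-adjoint) operator is represented by a matrix equal to its conjugate transpose; the particular matrices `xi` and `eta` and the helper names are chosen only for illustration.

```python
def matmul(x, y):
    """Product xy of two square matrices."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(op):
    """Conjugate transpose of a square matrix."""
    n = len(op)
    return [[op[j][i].conjugate() for j in range(n)] for i in range(n)]

def is_real(op):
    """True if the operator equals its adjoint."""
    return op == adjoint(op)

xi = [[1 + 0j, 2 - 1j], [2 + 1j, 3 + 0j]]
eta = [[0j, 1j], [-1j, 2 + 0j]]

xe = matmul(xi, eta)                                   # xi eta
ex = matmul(eta, xi)                                   # eta xi
sym = [[xe[i][j] + ex[i][j] for j in range(2)] for i in range(2)]
antisym = [[1j * (xe[i][j] - ex[i][j]) for j in range(2)] for i in range(2)]
```

For these matrices `is_real(xe)` is false while `is_real(sym)` and `is_real(antisym)` are both true, as the rule (5) predicts.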

We may get the conjugate complex of the product of three linear 
operators by successive applications of the rule (5) for the conjugate 
complex of the product of two of them. We have 

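$$\overline{\alpha\beta\gamma} = \overline{\alpha(\beta\gamma)} = \overline{\beta\gamma}\,\bar{\alpha} = \bar{\gamma}\bar{\beta}\bar{\alpha}, \tag{6}$$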

so the conjugate complex of the product of three linear operators 
equals the product of the conjugate complexes of the factors in the 
reverse order. The rule may easily be extended to the product of any 
number of linear operators. 

In the preceding section we saw that the product |A><B| is a linear
operator. We may get its conjugate complex by referring directly to
the definition of the adjoint. Multiplying |A><B| into a general bra
<P| we get <P|A><B|, whose conjugate imaginary ket is

$$\overline{\langle P|A\rangle}\,|B\rangle = \langle A|P\rangle|B\rangle = |B\rangle\langle A|P\rangle.$$

Hence

$$\overline{|A\rangle\langle B|} = |B\rangle\langle A|. \tag{7}$$

We now have several rules concerning conjugate complexes and 
conjugate imaginaries of products, namely equation (7) of Chapter I,
equations (4), (5), (6), (7) of this chapter, and the rule that the
conjugate imaginary of <P|α is ᾱ|P>. These rules can all be summed
up in a single comprehensive rule, the conjugate complex or conjugate
imaginary of any product of bra vectors, ket vectors, and linear operators
is obtained by taking the conjugate complex or conjugate imaginary of 
each factor and reversing the order of all the factors. The rule is easily 
verified to hold quite generally, also for the cases not explicitly given 
above. 

Theorem. If ξ is a real linear operator and

$$\xi^m|P\rangle = 0 \tag{8}$$

for a particular ket |P>, m being a positive integer, then

$$\xi|P\rangle = 0.$$

To prove the theorem, take first the case when m = 2. Equation
(8) then gives

$$\langle P|\xi^2|P\rangle = 0,$$

showing that the ket ξ|P> multiplied by the conjugate imaginary bra
<P|ξ is zero. From the assumption (8) of Chapter I with ξ|P> for |A>,
we see that ξ|P> must be zero. Thus the theorem is proved for m = 2.
Now take m > 2 and put

$$\xi^{m-2}|P\rangle = |Q\rangle.$$

Equation (8) now gives

$$\xi^2|Q\rangle = 0.$$

Applying the theorem for m = 2, we get

$$\xi|Q\rangle = 0$$

or

$$\xi^{m-1}|P\rangle = 0. \tag{9}$$

By repeating the process by which equation (9) is obtained from
(8), we obtain successively

$$\xi^{m-2}|P\rangle = 0, \quad \xi^{m-3}|P\rangle = 0, \quad \ldots, \quad \xi^2|P\rangle = 0, \quad \xi|P\rangle = 0,$$

and so the theorem is proved generally.
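
A short worked example, not taken from the text, may help to show why the hypothesis that ξ is real is needed. It assumes a two-component matrix representation, and the symbol ν is introduced here only for illustration. Take a linear operator ν that is not equal to its adjoint:

```latex
\nu = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
\bar{\nu} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \neq \nu, \qquad
|P\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\qquad
\nu|P\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \neq 0, \qquad
\nu^{2}|P\rangle = 0 .
```

Here the conclusion of the theorem fails. The step for m = 2 breaks down because the conjugate imaginary of ν|P> is <P|ν̄ and not <P|ν, so that <P|ν^2|P> is no longer the squared length of ν|P>.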


9. Eigenvalues and eigenvectors 

We must make a further development of the theory of linear 
operators, consisting in studying the equation 

$$\alpha|P\rangle = a|P\rangle, \tag{10}$$

where α is a linear operator and a is a number. This equation usually
presents itself in the form that α is a known linear operator and the
number a and the ket |P> are unknowns, which we have to try to
choose so as to satisfy (10), ignoring the trivial solution |P> = 0.

Equation (10) means that the linear operator α applied to the ket
|P> just multiplies this ket by a numerical factor without changing
its direction, or else multiplies it by the factor zero, so that it ceases
to have a direction. This same α applied to other kets will, of course,
in general change both their lengths and their directions. It should 
be noticed that only the direction of |P> is of importance in equation 
(10). If one multiplies |P> by any number not zero, it will not affect 
the question of whether (10) is satisfied or not. 

Together with equation (10), we should consider also the conjugate 
imaginary form of equation 

$$\langle Q|\alpha = b\langle Q|, \tag{11}$$

where b is a number. Here the unknowns are the number b and the
non-zero bra <Q|. Equations (10) and (11) are of such fundamental
importance in the theory that it is desirable to have some special
words to describe the relationships between the quantities involved.
If (10) is satisfied, we shall call a an eigenvalue† of the linear operator
α, or of the corresponding dynamical variable, and we shall call |P>
an eigenket of the linear operator or dynamical variable. Further, we
shall say that the eigenket |P> belongs to the eigenvalue a. Similarly,
if (11) is satisfied, we shall call b an eigenvalue of α and <Q| an
eigenbra belonging to this eigenvalue. The words eigenvalue,
eigenket, eigenbra have a meaning, of course, only with reference to a linear
operator or dynamical variable. 

Using this terminology, we can assert that, if an eigenket of a is 
multiplied by any number not zero, the resulting ket is also an 
eigenket and belongs to the same eigenvalue as the original one. 
It is possible to have two or more independent eigenkets of a linear 
operator belonging to the same eigenvalue of that linear operator, 
e.g. equation (10) may have several solutions, |P1>, |P2>, |P3>,... say,
all holding for the same value of a, with the various eigenkets |P1>,
|P2>, |P3>,... independent. In this case it is evident that any linear
combination of the eigenkets is another eigenket belonging to the
same eigenvalue of the linear operator, e.g.

$$c_1|P1\rangle + c_2|P2\rangle + c_3|P3\rangle + \ldots$$

is another solution of (10), where c_1, c_2, c_3,... are any numbers.
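
The statement about linear combinations can be checked on a small example. The sketch assumes a 3×3 matrix `alpha` chosen so that the eigenvalue 2 has two independent eigenkets; the matrix and the helper `is_eigenket` are hypothetical and serve only to illustrate equation (10).

```python
def apply(op, ket):
    """Apply a matrix to a ket."""
    return [sum(r * k for r, k in zip(row, ket)) for row in op]

def is_eigenket(op, ket, value, tol=1e-12):
    """Check op|P> = value|P> for a non-zero ket |P>."""
    if all(abs(c) < tol for c in ket):
        return False  # the trivial solution |P> = 0 is excluded
    image = apply(op, ket)
    return all(abs(i - value * c) < tol for i, c in zip(image, ket))

alpha = [[2, 0, 0], [0, 2, 0], [0, 0, 5]]
p1 = [1, 0, 0]            # eigenket belonging to the eigenvalue 2
p2 = [0, 1, 0]            # an independent eigenket, same eigenvalue
c1, c2 = 3 - 1j, 0.5j     # arbitrary numerical coefficients
combo = [c1 * x + c2 * y for x, y in zip(p1, p2)]
```

`is_eigenket(alpha, combo, 2)` holds for any choice of `c1` and `c2` not both zero, whereas a ket such as `[1, 0, 1]`, which mixes the eigenvalues 2 and 5, is not an eigenket at all.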

In the special case when the linear operator α of equations (10) and
(11) is a number, k say, it is obvious that any ket |P> and bra <Q|
will satisfy these equations provided a and b equal k. Thus a number
considered as a linear operator has just one eigenvalue, and any ket 
is an eigenket and any bra is an eigenbra, belonging to this eigenvalue. 

The theory of eigenvalues and eigenvectors of a linear operator α
which is not real is not of much use for quantum mechanics. We
shall therefore confine ourselves to real linear operators for the further
development of the theory. Putting for α the real linear operator ξ,
we have instead of equations (10) and (11)

$$\xi|P\rangle = a|P\rangle, \tag{12}$$

$$\langle Q|\xi = b\langle Q|. \tag{13}$$

† The word ‘proper’ is sometimes used instead of ‘eigen’, but this is not satisfactory
as the words ‘proper’ and ‘improper’ are often used with other meanings. For example,
in §§ 15 and 46 the words ‘improper function’ and ‘proper-energy’ are used. 
Three important results can now be readily deduced.

(i) The eigenvalues are all real numbers. To prove that a satisfying
(12) is real, we multiply (12) by the bra <P| on the left, obtaining

$$\langle P|\xi|P\rangle = a\langle P|P\rangle.$$

Now from equation (4) with <B| replaced by <P| and α replaced by
the real linear operator ξ, we see that the number <P|ξ|P> must be
real, and from (8) of § 6, <P|P> must be real and not zero. Hence a
is real. Similarly, by multiplying (13) by |Q> on the right, we can
prove that b is real.

Suppose we have a solution of (12) and we form the conjugate
imaginary equation, which will read

$$\langle P|\xi = a\langle P|$$

in view of the reality of ξ and a. This conjugate imaginary equation
now provides a solution of (13), with <Q| = <P| and b = a. Thus
we can infer

(ii) The eigenvalues associated with eigenkets are the same as the
eigenvalues associated with eigenbras.

(iii) The conjugate imaginary of any eigenket is an eigenbra belonging
to the same eigenvalue, and conversely. This last result makes it
reasonable to call the state corresponding to any eigenket or to the conjugate
imaginary eigenbra an eigenstate of the real dynamical variable ξ.
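
Result (i) can be illustrated with 2×2 matrices. The sketch assumes a matrix representation and finds eigenvalues as the roots of the characteristic equation, a standard device that the text itself does not use; `eigenvalues_2x2` is a hypothetical helper. It compares a matrix equal to its conjugate transpose with one that is not.

```python
import cmath

def eigenvalues_2x2(op):
    """Roots of the characteristic equation of a 2x2 matrix."""
    trace = op[0][0] + op[1][1]
    det = op[0][0] * op[1][1] - op[0][1] * op[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

xi = [[1 + 0j, 2 - 1j], [2 + 1j, 3 + 0j]]   # equals its conjugate transpose
alpha = [[1 + 0j, 2 - 1j], [0j, 1j]]        # does not

xi_values = eigenvalues_2x2(xi)
alpha_values = eigenvalues_2x2(alpha)
```

The entries of `xi_values` have vanishing imaginary parts, while `alpha_values` contains a number that is not real, so the reality of the eigenvalues does depend on the operator being real.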

Eigenvalues and eigenvectors of various real dynamical variables 
are used very extensively in quantum mechanics, so it is desirable 
to have some systematic notation for labelling them. The following 
is suitable for most purposes. If ξ is a real dynamical variable, we
call its eigenvalues ξ', ξ'', ξ^r, etc. Thus we have a letter by itself
denoting a real dynamical variable or a real linear operator, and the
same letter with primes or an index attached denoting a number,
namely an eigenvalue of what the letter by itself denotes. An
eigenvector may now be labelled by the eigenvalue to which it belongs.
Thus |ξ'> denotes an eigenket belonging to the eigenvalue ξ' of the
dynamical variable ξ. If in a piece of work we deal with more than
one eigenket belonging to the same eigenvalue of a dynamical variable,
we may distinguish them one from another by means of a further
label, or possibly of more than one further label. Thus, if we are
dealing with two eigenkets belonging to the same eigenvalue of ξ,
we may call them |ξ'1> and |ξ'2>.



32 DYNAMICAL VARIABLES AND OBSERVABLES § 9 

Theorem. Two eigenvectors of a real dynamical variable belonging 
to different eigenvalues are orthogonal. 

To prove the theorem, let |ξ'> and |ξ''> be two eigenkets of the real
dynamical variable ξ, belonging to the eigenvalues ξ' and ξ''
respectively. Then we have the equations

$$\xi|\xi'\rangle = \xi'|\xi'\rangle, \tag{14}$$

$$\xi|\xi''\rangle = \xi''|\xi''\rangle. \tag{15}$$

Taking the conjugate imaginary of (14), we get

$$\langle\xi'|\xi = \xi'\langle\xi'|.$$

Multiplying this by |ξ''> on the right gives

$$\langle\xi'|\xi|\xi''\rangle = \xi'\langle\xi'|\xi''\rangle$$

and multiplying (15) by <ξ'| on the left gives

$$\langle\xi'|\xi|\xi''\rangle = \xi''\langle\xi'|\xi''\rangle.$$

Hence, subtracting,

$$(\xi'-\xi'')\langle\xi'|\xi''\rangle = 0, \tag{16}$$

showing that, if ξ' ≠ ξ'', <ξ'|ξ''> = 0 and the two eigenvectors |ξ'>
and |ξ''> are orthogonal. This theorem will be referred to as the
orthogonality theorem.
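
The orthogonality theorem can be checked on a concrete 2×2 case. The sketch assumes a self-adjoint matrix with real diagonal entries `p`, `r` and off-diagonal entry `q`, and uses the closed-form eigenvalues and eigenkets of such a matrix; these formulas and all names are assumptions of the example, not part of the proof above.

```python
import math

def inner(a, b):
    """Scalar product <a|b> of two kets given as component lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def apply(op, ket):
    """Apply a matrix to a ket."""
    return [sum(r_ * k for r_, k in zip(row, ket)) for row in op]

p, q, r = 1.0, 2 - 1j, 3.0
xi = [[p, q], [q.conjugate(), r]]        # equals its conjugate transpose

# Eigenvalues of the 2x2 matrix from its characteristic equation.
half_gap = math.sqrt(((p - r) / 2) ** 2 + abs(q) ** 2)
xi1 = (p + r) / 2 + half_gap
xi2 = (p + r) / 2 - half_gap

# For q != 0 an eigenket belonging to the eigenvalue lam is (q, lam - p).
ket1 = [q, xi1 - p]
ket2 = [q, xi2 - p]
overlap = inner(ket1, ket2)              # the scalar product of the two eigenkets
```

Since `xi1` and `xi2` differ, the theorem requires `overlap` to vanish, which it does to within rounding error.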

We have been discussing properties of the eigenvalues and
eigenvectors of a real linear operator, but have not yet considered the
question of whether, for a given real linear operator, any eigenvalues
and eigenvectors exist, and if so, how to find them. This question
is in general very difficult to answer. There is one useful special case,
however, which is quite tractable, namely when the real linear
operator, ξ say, satisfies an algebraic equation

$$\phi(\xi) \equiv \xi^n + a_1\xi^{n-1} + a_2\xi^{n-2} + \ldots + a_n = 0, \tag{17}$$

the coefficients a being numbers. This equation means, of course,
that the linear operator φ(ξ) produces the result zero when applied
to any ket vector or to any bra vector.

Let (17) be the simplest algebraic equation that ξ satisfies. Then
it will be shown that

(α) The number of eigenvalues of ξ is n.

(β) There are so many eigenkets of ξ that any ket whatever can
be expressed as a sum of such eigenkets.

The algebraic form φ(ξ) can be factorized into n linear factors, the
result being

$$\phi(\xi) \equiv (\xi-c_1)(\xi-c_2)(\xi-c_3)\ldots(\xi-c_n) \tag{18}$$
say, the c's being numbers, not assumed to be all different. This
factorization can be performed with ξ a linear operator just as well
as with ξ an ordinary algebraic variable, since there is nothing
occurring in (18) that does not commute with ξ. Let the quotient
when φ(ξ) is divided by (ξ−c_r) be χ_r(ξ), so that

$$\phi(\xi) \equiv (\xi-c_r)\chi_r(\xi) \qquad (r = 1, 2, 3, \ldots, n).$$

Then, for any ket |P>,

$$(\xi-c_r)\chi_r(\xi)|P\rangle = \phi(\xi)|P\rangle = 0. \tag{19}$$

Now χ_r(ξ)|P> cannot vanish for every ket |P>, as otherwise χ_r(ξ)
itself would vanish and we should have ξ satisfying an algebraic
equation of degree n−1, which would contradict the assumption that
(17) is the simplest equation that ξ satisfies. If we choose |P> so that
χ_r(ξ)|P> does not vanish, then equation (19) shows that χ_r(ξ)|P> is
an eigenket of ξ, belonging to the eigenvalue c_r. The argument holds
for each value of r from 1 to n, and hence each of the c's is an
eigenvalue of ξ. No other number can be an eigenvalue of ξ, since if ξ' is
any eigenvalue, belonging to an eigenket |ξ'>,

$$\xi|\xi'\rangle = \xi'|\xi'\rangle$$

and we can deduce

$$\phi(\xi)|\xi'\rangle = \phi(\xi')|\xi'\rangle,$$

and since the left-hand side vanishes we must have φ(ξ') = 0.

To complete the proof of (α) we must verify that the c's are all
different. Suppose the c's are not all different and c_s occurs m times
say, with m > 1. Then φ(ξ) is of the form

$$\phi(\xi) \equiv (\xi-c_s)^m\,\theta(\xi),$$

with θ(ξ) a rational integral function of ξ. Equation (17) now gives us

$$(\xi-c_s)^m\,\theta(\xi)|A\rangle = 0 \tag{20}$$

for any ket |A>. Since c_s is an eigenvalue of ξ it must be real, so that
ξ−c_s is a real linear operator. Equation (20) is now of the same form
as equation (8) with ξ−c_s for ξ and θ(ξ)|A> for |P>. From the theorem
connected with equation (8) we can infer that

$$(\xi-c_s)\theta(\xi)|A\rangle = 0.$$

Since the ket |A> is arbitrary,

$$(\xi-c_s)\theta(\xi) = 0,$$

which contradicts the assumption that (17) is the simplest equation
that ξ satisfies. Hence the c's are all different and (α) is proved.
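
The construction of eigenkets from the quotients χ_r(ξ) can be followed on the simplest non-trivial case. The sketch assumes a 2×2 matrix `xi` satisfying ξ^2 − 1 = 0, so that n = 2 with c_1 = 1 and c_2 = −1, and hence χ_1(ξ) = ξ − c_2 and χ_2(ξ) = ξ − c_1. The matrix, the ket `ket_p` and the helper names are chosen only for illustration.

```python
def apply(op, ket):
    """Apply a matrix to a ket."""
    return [sum(r * k for r, k in zip(row, ket)) for row in op]

def shifted(op, c):
    """Matrix of the operator (op - c), with c a number."""
    n = len(op)
    return [[op[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]

xi = [[0, 1], [1, 0]]          # satisfies xi^2 - 1 = 0
c1, c2 = 1, -1                 # roots of phi(x) = x^2 - 1
ket_p = [2, 0]                 # an arbitrary ket |P>

chi1_p = apply(shifted(xi, c2), ket_p)   # chi_1(xi)|P> = (xi - c2)|P>
chi2_p = apply(shifted(xi, c1), ket_p)   # chi_2(xi)|P> = (xi - c1)|P>

# Dividing by the numbers chi_1(c1) = c1 - c2 and chi_2(c2) = c2 - c1
# and adding gives back the original ket.
recombined = [x / (c1 - c2) + y / (c2 - c1) for x, y in zip(chi1_p, chi2_p)]
```

Here `chi1_p` is an eigenket of `xi` belonging to the eigenvalue 1 and `chi2_p` one belonging to −1. In this example `recombined` equals `ket_p`, so the arbitrary ket is a sum of eigenkets, consistent with statement (β).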

Let χ_r(c_r) be the number obtained when c_r is substituted for ξ in

When we measure a real dynamical variable ξ, the disturbance
involved in the act of measurement causes a jump in the state of the
dynamical system. From physical continuity, if we make a second
measurement of the same dynamical variable ξ immediately after
the first, the result of the second measurement must be the same as
that of the first. Thus after the first measurement has been made,
there is no indeterminacy in the result of the second. Hence, after
the first measurement has been made, the system is in an eigenstate
of the dynamical variable ξ, the eigenvalue it belongs to being equal
to the result of the first measurement. This conclusion must still hold
if the second measurement is not actually made. In this way we see
that a measurement always causes the system to jump into an
eigenstate of the dynamical variable that is being measured, the eigenvalue
this eigenstate belongs to being equal to the result of the
measurement.

We can infer that, with the dynamical system in any state, any 
result of a measurement of a real dynamical variable is one of its
eigenvalues. Conversely, every eigenvalue is a possible result of a
measurement of the dynamical variable for some state of the system, since it is
certainly the result if the state is an eigenstate belonging to this 
eigenvalue. This gives us the physical significance of eigenvalues. 
The set of eigenvalues of a real dynamical variable are just the 
possible results of measurements of that dynamical variable and the 
calculation of eigenvalues is for this reason an important problem. 

Another assumption we make connected with the physical
interpretation of the theory is that, if a certain real dynamical variable
ξ is measured with the system in a particular state, the states into which
the system may jump on account of the measurement are such that the
original state is dependent on them. Now these states into which
the system may jump are all eigenstates of ξ, and hence the original
state is dependent on eigenstates of ξ. But the original state may be
any state, so we can conclude that any state is dependent on
eigenstates of ξ. If we define a complete set of states to be a set such that
any state is dependent on them, then our conclusion can be
formulated: the eigenstates of ξ form a complete set.

Not every real dynamical variable has sufficient eigenstates to form 
a complete set. Those whose eigenstates do not form complete sets 
are not quantities that can be measured. We obtain in this way a
further condition that a dynamical variable has to satisfy in order
that it shall be susceptible to measurement, in addition to the
condition that it shall be real. We call a real dynamical variable whose
eigenstates form a complete set an observable. Thus any quantity
that can be measured is an observable. 

The question now presents itself: can every observable be
measured? The answer theoretically is yes. In practice it may be
very awkward, or perhaps even beyond the ingenuity of the
experimenter, to devise an apparatus which could measure some particular
observable, but the theory always allows one to imagine that the 
measurement can be made. 

Let us examine mathematically the condition for a real dynamical
variable ξ to be an observable. Its eigenvalues may consist of a
(finite or infinite) discrete set of numbers, or alternatively, they
may consist of all numbers in a certain range, such as all numbers
lying between a and b. In the former case, the condition that
any state is dependent on eigenstates of ξ is that any ket can
be expressed as a sum of eigenkets of ξ. In the latter case the
condition needs modification, since one may have an integral instead
of a sum, i.e. a ket |P> may be expressible as an integral of
eigenkets of ξ,

$$|P\rangle = \int |\xi'\rangle\,d\xi', \tag{24}$$

|ξ'> being an eigenket of ξ belonging to the eigenvalue ξ' and the
range of integration being the range of eigenvalues, as such a ket is
dependent on eigenkets of ξ. Not every ket dependent on eigenkets
of ξ can be expressed in the form of the right-hand side of (24), since
one of the eigenkets itself cannot, and more generally any sum of
eigenkets cannot. The condition for the eigenstates of ξ to form a
complete set must thus be formulated, that any ket |P> can be
expressed as an integral plus a sum of eigenkets of ξ, i.e.

$$|P\rangle = \int |\xi'c\rangle\,d\xi' + \sum_r |\xi^r d\rangle, \tag{25}$$

where the |ξ'c>, |ξ^r d> are all eigenkets of ξ, the labels c and d being
inserted to distinguish them when the eigenvalues ξ' and ξ^r are equal,
and where the integral is taken over the whole range of eigenvalues
and the sum is taken over any selection of them. If this condition
is satisfied in the case when the eigenvalues of ξ consist of a range
of numbers, then ξ is an observable.

There is a more general case that sometimes occurs, namely the
eigenvalues of ξ may consist of a range of numbers together with a
discrete set of numbers lying outside the range. In this case the
condition that ξ shall be an observable is still that any ket shall be
expressible in the form of the right-hand side of (25), but the sum 
over r is now a sum over the discrete set of eigenvalues as well as a 
selection of those in the range. 

It is often very difficult to decide mathematically whether a
particular real dynamical variable satisfies the condition for being an
observable or not, because the whole problem of finding eigenvalues 
and eigenvectors is in general very difficult. However, we may have 
good reason on experimental grounds for believing that the dynamical 
variable can be measured and then we may reasonably assume that it 
is an observable even though the mathematical proof is missing. This is 
a thing we shall frequently do during the course of development of the 
theory, e.g. we shall assume the energy of any dynamical system to be 
always an observable, even though it is beyond the power of present- 
day mathematical analysis to prove it so except in simple cases. 

In the special case when the real dynamical variable is a number, 
every state is an eigenstate and the dynamical variable is obviously 
an observable. Any measurement of it always gives the same result, 
so it is just a physical constant, like the charge on an electron. 
A physical constant in quantum mechanics may thus be looked upon 
either as an observable with a single eigenvalue or as a mere number 
appearing in the equations, the two points of view being equivalent. 

If the real dynamical variable satisfies an algebraic equation, then
the result (β) of the preceding section shows that the dynamical
variable is an observable. Such an observable has a finite number
of eigenvalues. Conversely, any observable with a finite number of
eigenvalues satisfies an algebraic equation, since if the observable ξ
has as its eigenvalues ξ', ξ'',..., ξ^n, then

$$(\xi-\xi')(\xi-\xi'')\ldots(\xi-\xi^n)|P\rangle = 0$$

holds for |P> any eigenket of ξ, and thus it holds for any |P>
whatever, because any ket can be expressed as a sum of eigenkets of ξ
on account of ξ being an observable. Hence

$$(\xi-\xi')(\xi-\xi'')\ldots(\xi-\xi^n) = 0. \tag{26}$$
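
Equation (26) can be checked directly for an observable with two eigenvalues. The sketch assumes the symmetric 2×2 matrix `xi` below, whose eigenvalues are 1 and 3; the matrix and the helpers `matmul` and `shifted` are illustrative assumptions.

```python
def matmul(x, y):
    """Product xy of two square matrices."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(op, c):
    """Matrix of the operator (op - c), with c a number."""
    n = len(op)
    return [[op[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]

xi = [[2, 1], [1, 2]]                              # eigenvalues 1 and 3
product = matmul(shifted(xi, 1), shifted(xi, 3))   # (xi - 1)(xi - 3)
single = shifted(xi, 1)                            # one factor on its own
```

The matrix `product` vanishes identically although neither factor does, so `xi` satisfies the algebraic equation (ξ − 1)(ξ − 3) = 0 of degree equal to the number of its eigenvalues.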

As an example we may consider the linear operator $|A\rangle\langle A|$, where
$|A\rangle$ is a normalized ket. This linear operator is real according to (7),
and its square is

$$\{|A\rangle\langle A|\}^2 = |A\rangle\langle A|A\rangle\langle A| = |A\rangle\langle A| \tag{27}$$

since $\langle A|A\rangle = 1$. Thus its square equals itself and so it satisfies an
algebraic equation and is an observable. Its eigenvalues are 1 and 0,
with $|A\rangle$ as the eigenket belonging to the eigenvalue 1 and all kets
orthogonal to $|A\rangle$ as eigenkets belonging to the eigenvalue 0. A
measurement of the observable thus certainly gives the result 1 if
the dynamical system is in the state corresponding to $|A\rangle$ and the
result 0 if the system is in any orthogonal state, so the observable
may be described as the quantity which determines whether the
system is in the state $|A\rangle$ or not.
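
The properties of this operator can be checked numerically in a small case. The following is a minimal Python sketch, not part of the original text: the two-component kets `A` and `B` and the helper names `outer`, `matmul` and `apply` are illustrative assumptions, with a ket stored as a list of complex components.

```python
# Sketch only: kets are lists of complex components; all names are illustrative.

def outer(ket, other):
    """Matrix of |ket><other|: element (i, j) is ket[i] * conj(other[j])."""
    return [[k * o.conjugate() for o in other] for k in ket]

def matmul(m, n):
    """Product of two square matrices of equal size."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def apply(m, ket):
    """Result of the linear operator m multiplied into the ket."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in m]

A = [0.6 + 0j, 0.8j]     # normalized: |0.6|^2 + |0.8i|^2 = 1
B = [0.8 + 0j, -0.6j]    # orthogonal to A: <A|B> = 0

P = outer(A, A)          # the operator |A><A|
P_squared = matmul(P, P) # should agree with P, as in equation (27)

print(apply(P, A))       # should reproduce A up to rounding (eigenvalue 1)
print(apply(P, B))       # should be the zero ket up to rounding (eigenvalue 0)
```

Comparing `P_squared` with `P` element by element gives agreement up to floating-point rounding, which is the numerical counterpart of (27).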

Before concluding this section we should examine the conditions
for an integral such as occurs in (24) to be significant. Suppose $|X\rangle$
and $|Y\rangle$ are two kets which can be expressed as integrals of eigenkets
of the observable $\xi$,

$$|X\rangle = \int |\xi'x\rangle\,d\xi', \qquad |Y\rangle = \int |\xi''y\rangle\,d\xi'',$$

$x$ and $y$ being used as labels to distinguish the two integrands. Then
we have, taking the conjugate imaginary of the first equation and
multiplying by the second,

$$\langle X|Y\rangle = \iint \langle\xi'x|\xi''y\rangle\,d\xi'\,d\xi''. \tag{28}$$

Consider now the single integral

$$\int \langle\xi'x|\xi''y\rangle\,d\xi''. \tag{29}$$

From the orthogonality theorem, the integrand here must vanish
over the whole range of integration except the one point $\xi'' = \xi'$.
If the integrand is finite at this point, the integral (29) vanishes, and
if this holds for all $\xi'$, we get from (28) that $\langle X|Y\rangle$ vanishes. Now
in general $\langle X|Y\rangle$ does not vanish, so in general $\langle\xi'x|\xi'y\rangle$ must be
infinitely great in such a way as to make (29) non-vanishing and
finite. The form of infinity required for this will be discussed in § 15.

In our work up to the present it has been implied that our bra and 
ket vectors are of finite length and their scalar products are finite. 
We see now the need for relaxing this condition when we are dealing 
with eigenvectors of an observable whose eigenvalues form a range. 
If we did not relax it, the phenomenon of ranges of eigenvalues could
not occur and our theory would be too weak for most practical
problems.

Taking $|Y\rangle = |X\rangle$ above, we get the result that in general
$\langle\xi'x|\xi'x\rangle$ is infinitely great. We shall assume that if $|\xi'x\rangle \neq 0$

$$\int \langle\xi'x|\xi''x\rangle\,d\xi'' > 0, \tag{30}$$

as the axiom corresponding to (8) of § 6 for vectors of infinite
length.

The space of bra or ket vectors when the vectors are restricted to 
be of finite length and to have finite scalar products is called by 
mathematicians a Hilbert space. The bra and ket vectors that we 
now use form a more general space than a Hilbert space. 

We can now see that the expansion of a ket $|P\rangle$ in the form of the
right-hand side of (25) is unique, provided there are not two or more
terms in the sum referring to the same eigenvalue. To prove this
result, let us suppose that two different expansions of $|P\rangle$ are
possible. Then by subtracting one from the other, we get an equation
of the form

$$0 = \int |\xi'a\rangle\,d\xi' + \sum_s |\xi^s b\rangle, \tag{31}$$

$a$ and $b$ being used as new labels for the eigenvectors, and the sum
over $s$ including all terms left after the subtraction of one sum from
the other. If there is a term in the sum in (31) referring to an
eigenvalue $\xi^t$ not in the range, we get, by multiplying (31) on the left by
$\langle\xi^t b|$ and using the orthogonality theorem,

$$0 = \langle\xi^t b|\xi^t b\rangle,$$

which contradicts (8) of § 6. Again, if the integrand in (31) does not
vanish for some eigenvalue $\xi''$ not equal to any $\xi^s$ occurring in the
sum, we get, by multiplying (31) on the left by $\langle\xi''a|$ and using the
orthogonality theorem,

$$0 = \int \langle\xi''a|\xi'a\rangle\,d\xi',$$

which contradicts (30). Finally, if there is a term in the sum in (31)
referring to an eigenvalue $\xi^t$ in the range, we get, multiplying (31) on
the left by $\langle\xi^t b|$,

$$0 = \int \langle\xi^t b|\xi'a\rangle\,d\xi' + \langle\xi^t b|\xi^t b\rangle \tag{32}$$

and multiplying (31) on the left by $\langle\xi^t a|$,

$$0 = \int \langle\xi^t a|\xi'a\rangle\,d\xi' + \langle\xi^t a|\xi^t b\rangle. \tag{33}$$

Now the integral in (33) is finite, so $\langle\xi^t a|\xi^t b\rangle$ is finite and
$\langle\xi^t b|\xi^t a\rangle$ is finite. The integral in (32) must then be zero, so
$\langle\xi^t b|\xi^t b\rangle$ is zero and we again have a contradiction. Thus every
term in (31) must vanish and the expansion of a ket $|P\rangle$ in the form of
the right-hand side of (25) must be unique.

11. Functions of observables 

Let $\xi$ be an observable. We can multiply it by any real number $k$
and get another observable $k\xi$. In order that our theory may be
self-consistent it is necessary that, when the system is in a state such
that a measurement of the observable $\xi$ certainly gives the result $\xi'$,
a measurement of the observable $k\xi$ shall certainly give the result $k\xi'$.
It is easily verified that this condition is fulfilled. The ket
corresponding to a state for which a measurement of $\xi$ certainly gives the
result $\xi'$ is an eigenket of $\xi$, $|\xi'\rangle$ say, satisfying

$$\xi|\xi'\rangle = \xi'|\xi'\rangle.$$

This equation leads to

$$k\xi|\xi'\rangle = k\xi'|\xi'\rangle,$$

showing that $|\xi'\rangle$ is an eigenket of $k\xi$ belonging to the eigenvalue $k\xi'$,
and thus that a measurement of $k\xi$ will certainly give the result $k\xi'$.
More generally, we may take any real function of $\xi$, $f(\xi)$ say, and
consider it as a new observable which is automatically measured
whenever $\xi$ is measured, since an experimental determination of the
value of $\xi$ also provides the value of $f(\xi)$. We need not restrict $f(\xi)$ to
be real, and then its real and pure imaginary parts are two observables
which are automatically measured when $\xi$ is measured. For the theory
to be consistent it is necessary that, when the system is in a state
such that a measurement of $\xi$ certainly gives the result $\xi'$, a
measurement of the real and pure imaginary parts of $f(\xi)$ shall certainly
give for results the real and pure imaginary parts of $f(\xi')$. In the case
when $f(\xi)$ is expressible as a power series

$$f(\xi) = c_0 + c_1\xi + c_2\xi^2 + c_3\xi^3 + \cdots,$$

the $c$'s being numbers, this condition can again be verified by
elementary algebra. In the case of more general functions $f$ it may not be
possible to verify the condition. The condition may then be used to
define $f(\xi)$, which we have not yet defined mathematically. In this
way we can get a more general definition of a function of an
observable than is provided by power series.

We define $f(\xi)$ in general to be that linear operator which satisfies

$$f(\xi)|\xi'\rangle = f(\xi')|\xi'\rangle \tag{34}$$

for every eigenket $|\xi'\rangle$ of $\xi$, $f(\xi')$ being a number for each eigenvalue $\xi'$.
It is easily seen that this definition is self-consistent when applied to
eigenkets $|\xi'\rangle$ that are not independent. If we have an eigenket $|\xi'A\rangle$
dependent on other eigenkets of $\xi$, these other eigenkets must all
belong to the same eigenvalue $\xi'$, otherwise we should have an
equation of the type (31), which we have seen is impossible. On multiplying
the equation which expresses $|\xi'A\rangle$ linearly in terms of the other
eigenkets of $\xi$ by $f(\xi)$ on the left, we merely multiply each term in it
by the number $f(\xi')$, so we obviously get a consistent equation.
Further, equation (34) is sufficient to define the linear operator $f(\xi)$
completely, since to get the result of $f(\xi)$ multiplied into an arbitrary
ket $|P\rangle$, we have only to expand $|P\rangle$ in the form of the right-hand
side of (25) and take

$$f(\xi)|P\rangle = \int f(\xi')|\xi'c\rangle\,d\xi' + \sum_r f(\xi^r)|\xi^r d\rangle. \tag{35}$$
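
For an observable with only discrete eigenvalues, the rule (35) reduces to a sum and can be tried out directly. The sketch below is an illustration under stated assumptions rather than anything given in the text: the observable is taken to be the 2×2 matrix with rows (0, 1) and (1, 0), whose orthonormal eigenkets and eigenvalues ±1 are written in by hand, and the names `inner`, `apply_function`, `step` and `eigenpairs` are hypothetical.

```python
import math

def inner(a, b):
    """Scalar product <a|b> of two kets given as component lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def apply_function(f, eigenpairs, ket):
    """f(xi)|P> from the discrete part of (35), for orthonormal eigenkets."""
    result = [0j] * len(ket)
    for value, eigenket in eigenpairs:
        coeff = inner(eigenket, ket)       # component of |P> along this eigenket
        for i, component in enumerate(eigenket):
            result[i] += f(value) * coeff * component
    return result

def step(value):
    """A function that is neither analytic nor continuous."""
    return 1.0 if value > 0 else 0.0

s = 1 / math.sqrt(2)
# Assumed observable: matrix [[0, 1], [1, 0]], eigenvalues +1 and -1.
eigenpairs = [(1.0, [s + 0j, s + 0j]), (-1.0, [s + 0j, -s + 0j])]
P = [2 + 0j, 1j]

print(apply_function(lambda v: v, eigenpairs, P))      # the observable itself on |P>
print(apply_function(lambda v: v * v, eigenpairs, P))  # its square, which returns |P>
print(apply_function(step, eigenpairs, P))             # keeps only the +1 component
```

Only the values of `f` at the eigenvalues enter the calculation, so a discontinuous function such as `step` is handled in the same way as a power series.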

The conjugate complex $\overline{f(\xi)}$ of $f(\xi)$ is defined by the conjugate
imaginary equation to (34), namely

$$\langle\xi'|\overline{f(\xi)} = \bar f(\xi')\langle\xi'|,$$

holding for any eigenbra $\langle\xi'|$, $\bar f(\xi')$ being the conjugate complex
function to $f(\xi')$. Let us replace $\xi'$ here by $\xi''$ and multiply the
equation on the right by the arbitrary ket $|P\rangle$. Then we get, using
the expansion (25) for $|P\rangle$,

$$\begin{aligned}
\langle\xi''|\overline{f(\xi)}|P\rangle &= \bar f(\xi'')\langle\xi''|P\rangle \\
&= \int \bar f(\xi'')\langle\xi''|\xi'c\rangle\,d\xi' + \sum_r \bar f(\xi'')\langle\xi''|\xi^r d\rangle \\
&= \int \bar f(\xi'')\langle\xi''|\xi'c\rangle\,d\xi' + \bar f(\xi'')\langle\xi''|\xi''d\rangle
\end{aligned} \tag{36}$$

with the help of the orthogonality theorem, $\langle\xi''|\xi''d\rangle$ being
understood to be zero if $\xi''$ is not one of the eigenvalues to which the terms
in the sum in (25) refer. Again, putting the conjugate complex
function $\bar f(\xi')$ for $f(\xi')$ in (35) and multiplying on the left by $\langle\xi''|$,
we get

$$\langle\xi''|\bar f(\xi)|P\rangle = \int \bar f(\xi')\langle\xi''|\xi'c\rangle\,d\xi' + \bar f(\xi'')\langle\xi''|\xi''d\rangle.$$

The right-hand side here equals that of (36), since the integrands
vanish for $\xi' \neq \xi''$, and hence

$$\langle\xi''|\overline{f(\xi)}|P\rangle = \langle\xi''|\bar f(\xi)|P\rangle.$$

This holds for $\langle\xi''|$ any eigenbra and $|P\rangle$ any ket, so

$$\overline{f(\xi)} = \bar f(\xi). \tag{37}$$

Thus the conjugate complex of the linear operator $f(\xi)$ is the conjugate
complex function $\bar f$ of $\xi$.

It follows as a corollary that if $f(\xi')$ is a real function of $\xi'$, $f(\xi)$ is
a real linear operator. $f(\xi)$ is then also an observable, since its
eigenstates form a complete set, every eigenstate of $\xi$ being also an
eigenstate of $f(\xi)$.

With the above definition we are able to give a meaning to any
function $f$ of an observable, provided only that the domain of existence
of the function of a real variable $f(x)$ includes all the eigenvalues of the
observable. If the domain of existence contains other points besides
these eigenvalues, then the values of $f(x)$ for these other points will
not affect the function of the observable. The function need not be
analytic or continuous. The eigenvalues of a function $f$ of an
observable are just the function $f$ of the eigenvalues of the observable.

It is important to observe that the possibility of defining a function
$f$ of an observable requires the existence of a unique number $f(x)$ for
each value of $x$ which is an eigenvalue of the observable. Thus the
function $f(x)$ must be single-valued. This may be illustrated by
considering the question: When we have an observable $f(A)$ which is a
real function of the observable $A$, is the observable $A$ a function of
the observable $f(A)$? The answer to this is yes, if different eigenvalues
$A'$ of $A$ always lead to different values of $f(A')$. If, however, there
exist two different eigenvalues of $A$, $A'$ and $A''$ say, such that
$f(A') = f(A'')$, then, corresponding to the eigenvalue $f(A')$ of the
observable $f(A)$, there will not be a unique eigenvalue of the
observable $A$ and the latter will not be a function of the observable $f(A)$.
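
A small worked case may make the second alternative concrete. It is an illustrative assumption, not an example taken from the text: suppose $A$ has just the two eigenvalues $+1$ and $-1$, with eigenkets written $|{+1}\rangle$ and $|{-1}\rangle$, and take $f(x) = x^2$.

```latex
f(A)\,|{+1}\rangle = (+1)^2\,|{+1}\rangle = |{+1}\rangle, \qquad
f(A)\,|{-1}\rangle = (-1)^2\,|{-1}\rangle = |{-1}\rangle
\quad\Longrightarrow\quad f(A) = A^2 = 1 .
```

Here $f(A)$ has the single eigenvalue 1, to which both eigenvalues of $A$ correspond. Any function $g$ of $f(A)$ multiplies every ket by the same number $g(1)$, so no choice of $g$ can reproduce $A$, which multiplies $|{+1}\rangle$ and $|{-1}\rangle$ by different numbers.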

It may easily be verified mathematically, from the definition, that 
the sum or product of two functions of an observable is a function 
of that observable and that a function of a function of an observable 
is a function of that observable. Also it is easily seen that the whole 
theory of functions of an observable is symmetrical between bras and 
kets and that we could equally well work from the equation

$$\langle\xi'|f(\xi) = f(\xi')\langle\xi'| \tag{38}$$

instead of from (34). 

We shall conclude this section with a discussion of two examples
which are of great practical importance, namely the reciprocal and
the square root. The reciprocal of an observable exists if the
observable does not have the eigenvalue zero. If the observable $\alpha$ does not
have the eigenvalue zero, the reciprocal observable, which we call $\alpha^{-1}$
or $1/\alpha$, will satisfy

$$\alpha^{-1}|\alpha'\rangle = \alpha'^{-1}|\alpha'\rangle, \tag{39}$$

where $|\alpha'\rangle$ is an eigenket of $\alpha$ belonging to the eigenvalue $\alpha'$. Hence

$$\alpha\alpha^{-1}|\alpha'\rangle = \alpha\alpha'^{-1}|\alpha'\rangle = |\alpha'\rangle.$$

Since this holds for any eigenket $|\alpha'\rangle$, we must have

$$\alpha\alpha^{-1} = 1. \tag{40}$$

Similarly,

$$\alpha^{-1}\alpha = 1. \tag{41}$$

Either of these equations is sufficient to determine $\alpha^{-1}$ completely,
provided $\alpha$ does not have the eigenvalue zero. To prove this in the
case of (40), let $x$ be any linear operator satisfying the equation

$$\alpha x = 1$$

and multiply both sides on the left by the $\alpha^{-1}$ defined by (39). The
result is

$$\alpha^{-1}\alpha x = \alpha^{-1}$$

and hence from (41)

$$x = \alpha^{-1}.$$

Equations (40) and (41) can be used to define the reciprocal, when
it exists, of a general linear operator $\alpha$, which need not even be real.
One of these equations by itself is then not necessarily sufficient. If
any two linear operators $\alpha$ and $\beta$ have reciprocals, their product $\alpha\beta$
has the reciprocal

$$(\alpha\beta)^{-1} = \beta^{-1}\alpha^{-1}, \tag{42}$$

obtained by taking the reciprocal of each factor and reversing their
order. We verify (42) by noting that its right-hand side gives unity
when multiplied by $\alpha\beta$, either on the right or on the left. This
reciprocal law for products can be immediately extended to more than
two factors, i.e.,

$$(\alpha\beta\gamma\cdots)^{-1} = \cdots\gamma^{-1}\beta^{-1}\alpha^{-1}.$$
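
The reversal of order in (42) matters whenever the factors do not commute, and this is easy to see with exact arithmetic. The following Python sketch is illustrative only: the 2×2 matrices `alpha` and `beta` are arbitrary assumed examples of linear operators that have reciprocals (they are not real in the sense of the text), and `matmul2` and `inverse2` are hypothetical helper names.

```python
from fractions import Fraction as F

def matmul2(m, n):
    """Product m n of two 2x2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(m):
    """Reciprocal of a 2x2 matrix; raises if it does not exist."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("no reciprocal: determinant is zero")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Two assumed linear operators with reciprocals; they do not commute.
alpha = [[F(2), F(1)], [F(0), F(1)]]
beta = [[F(1), F(3)], [F(1), F(1)]]
identity = [[F(1), F(0)], [F(0), F(1)]]

lhs = inverse2(matmul2(alpha, beta))                    # (alpha beta)^-1
rhs = matmul2(inverse2(beta), inverse2(alpha))          # beta^-1 alpha^-1
wrong_order = matmul2(inverse2(alpha), inverse2(beta))  # alpha^-1 beta^-1

print(lhs == rhs)          # the law (42)
print(lhs == wrong_order)  # fails here because alpha and beta do not commute
```

Exact fractions are used so that the comparison is an equality rather than an approximate agreement.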

The square root of an observable $\alpha$ always exists, and is real if $\alpha$
has no negative eigenvalues. We write it $\sqrt{\alpha}$ or $\alpha^{1/2}$. It satisfies

$$\sqrt{\alpha}\,|\alpha'\rangle = \pm\sqrt{\alpha'}\,|\alpha'\rangle, \tag{43}$$

$|\alpha'\rangle$ being an eigenket of $\alpha$ belonging to the eigenvalue $\alpha'$. Hence

$$\sqrt{\alpha}\sqrt{\alpha}\,|\alpha'\rangle = \sqrt{\alpha'}\sqrt{\alpha'}\,|\alpha'\rangle = \alpha'|\alpha'\rangle = \alpha|\alpha'\rangle,$$

and since this holds for any eigenket $|\alpha'\rangle$ we must have

$$\sqrt{\alpha}\sqrt{\alpha} = \alpha. \tag{44}$$

On account of the ambiguity of sign in (43) there will be several
square roots. To fix one of them we must specify a particular sign
in (43) for each eigenvalue. This sign may vary irregularly from one
eigenvalue to the next and equation (43) will always define a linear
operator $\sqrt{\alpha}$ satisfying (44) and forming a square-root function of $\alpha$.
If there is an eigenvalue of $\alpha$ with two or more independent eigenkets
belonging to it, then we must, according to our definition of a
function, have the same sign in (43) for each of these eigenkets. If we
took different signs, however, equation (44) would still hold, and hence
equation (44) by itself is not sufficient to define $\sqrt{\alpha}$, except in the
special case when there is only one independent eigenket of $\alpha$
belonging to any eigenvalue.

The number of different square roots of an observable is $2^n$, where
$n$ is the total number of eigenvalues not zero. In practice the
square-root function is used only for observables without negative
eigenvalues and the particular square root that is useful is the one for
which the positive sign is always taken in (43). This one will be called
the positive square root.
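
The count $2^n$ can be reproduced by enumerating the sign choices in (43). The sketch below is a hedged illustration, not part of the text: it assumes a finite list of distinct non-negative eigenvalues, represents each square-root function simply by the tuple of values it takes on those eigenvalues, and uses the hypothetical name `square_root_functions`.

```python
from itertools import product
import math

def square_root_functions(eigenvalues):
    """Distinct sign assignments in (43), one tuple of root values per choice.

    Assumes the eigenvalues are distinct and non-negative, so every root is real.
    """
    roots = set()
    for signs in product((+1, -1), repeat=len(eigenvalues)):
        roots.add(tuple(sign * math.sqrt(v) for sign, v in zip(signs, eigenvalues)))
    return roots

# Three eigenvalues, one of them zero: only the two non-zero ones double the count.
roots = square_root_functions((4.0, 9.0, 0.0))
print(len(roots))   # number of distinct square-root functions found

positive_root = tuple(math.sqrt(v) for v in (4.0, 9.0, 0.0))
print(positive_root in roots)
```

The zero eigenvalue contributes no extra root because $+0$ and $-0$ are the same number, which is why $n$ counts only the eigenvalues not zero.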

12. The general physical interpretation 

The assumptions that we made at the beginning of § 10 to get a
physical interpretation of the mathematical theory are of a rather
special kind, since they can be used only in connexion with
eigenstates. We need some more general assumption which will enable us
to extract physical information from the mathematics even when we
are not dealing with eigenstates.

In classical mechanics an observable always, as we say, ‘has a
value’ for any particular state of the system. What is there in
quantum mechanics corresponding to this? If we take any observable $\xi$
and any two states $x$ and $y$, corresponding to the vectors $\langle x|$ and $|y\rangle$,
then we can form the number $\langle x|\xi|y\rangle$. This number is not very
closely analogous to the value which an observable can ‘have’ in the
classical theory, for three reasons, namely, (i) it refers to two states
of the system, while the classical value always refers to one, (ii) it is
in general not a real number, and (iii) it is not uniquely determined
by the observable and the states, since the vectors $\langle x|$ and $|y\rangle$ contain
arbitrary numerical factors. Even if we impose on $\langle x|$ and $|y\rangle$ the
condition that they shall be normalized, there will still be an
undetermined factor of modulus unity in $\langle x|\xi|y\rangle$. These three reasons cease
to apply, however, if we take the two states to be identical and $|y\rangle$
to be the conjugate imaginary vector to $\langle x|$. The number that we
then get, namely $\langle x|\xi|x\rangle$, is necessarily real, and also it is uniquely
determined when $\langle x|$ is normalized, since if we multiply $\langle x|$ by the
numerical factor $e^{ic}$, $c$ being some real number, we must multiply
$|x\rangle$ by $e^{-ic}$ and $\langle x|\xi|x\rangle$ will be unaltered.

One might thus be inclined to make the tentative assumption that
the observable $\xi$ ‘has the value’ $\langle x|\xi|x\rangle$ for the state $x$, in a sense
analogous to the classical sense. This would not be satisfactory,
though, for the following reason. Let us take a second observable $\eta$,
which would have by the above assumption the value $\langle x|\eta|x\rangle$ for
this same state. We should then expect, from classical analogy, that
for this state the sum of the two observables would have a value
equal to the sum of the values of the two observables separately and
the product of the two observables would have a value equal to the
product of the values of the two observables separately. Actually, the
tentative assumption would give for the sum of the two observables
the value $\langle x|\xi+\eta|x\rangle$, which is, in fact, equal to the sum of $\langle x|\xi|x\rangle$
and $\langle x|\eta|x\rangle$, but for the product it would give the value $\langle x|\xi\eta|x\rangle$
or $\langle x|\eta\xi|x\rangle$, neither of which is connected in any simple way with
$\langle x|\xi|x\rangle$ and $\langle x|\eta|x\rangle$.

However, since things go wrong only with the product and not with
the sum, it would be reasonable to call $\langle x|\xi|x\rangle$ the average value of
the observable $\xi$ for the state $x$. This is because the average of the
sum of two quantities must equal the sum of their averages, but the
average of their product need not equal the product of their averages.
We therefore make the general assumption that if the measurement
of the observable $\xi$ for the system in the state corresponding to $|x\rangle$ is
made a large number of times, the average of all the results obtained will
be $\langle x|\xi|x\rangle$, provided $|x\rangle$ is normalized. If $|x\rangle$ is not normalized, as is
necessarily the case if the state $x$ is an eigenstate of some observable
belonging to an eigenvalue in a range, the assumption becomes that
the average result of a measurement of $\xi$ is proportional to $\langle x|\xi|x\rangle$.
This general assumption provides a basis for a general physical
interpretation of the theory.
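
The contrast between sums and products can be seen in a two-component example. The following Python sketch is an illustration under assumptions that are not in the text: the observables are taken to be the real symmetric 2×2 matrices `xi` and `eta` below, the state is the normalized ket `x`, and the product is formed from `xi` with itself so that it is certainly an observable. The helper names are hypothetical.

```python
def apply(m, ket):
    """Result of the linear operator m multiplied into the ket."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in m]

def average(m, ket):
    """<x|m|x> for a normalized ket |x>."""
    image = apply(m, ket)
    return sum(a.conjugate() * b for a, b in zip(ket, image))

def add(m, n):
    """Sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(m, n)]

def matmul(m, n):
    """Product of two square matrices of equal size."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

xi = [[0, 1], [1, 0]]     # assumed observable with eigenvalues +1 and -1
eta = [[1, 0], [0, -1]]   # a second assumed observable
x = [1, 0]                # normalized state ket

avg_xi = average(xi, x)
avg_eta = average(eta, x)
avg_sum = average(add(xi, eta), x)
avg_square = average(matmul(xi, xi), x)

print(avg_sum == avg_xi + avg_eta)    # averages add
print(avg_square == avg_xi * avg_xi)  # but the average of a product is not the product
```

With these choices the average of `xi` is 0 while the average of its square is 1, which is the behaviour expected of an average and not of a definite value.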

The expression that an observable ‘has a particular value’ for a
particular state is permissible in quantum mechanics in the special
case when a measurement of the observable is certain to lead to the
particular value, so that the state is an eigenstate of the observable.
It may easily be verified from the algebra that, with this restricted
meaning for an observable ‘having a value’, if two observables have
values for a particular state, then for this state the sum of the two
observables (if this sum is an observable†) has a value equal to the
sum of the values of the two observables separately and the product
of the two observables (if this product is an observable‡) has a value
equal to the product of the values of the two observables separately.

In the general case we cannot speak of an observable having a value 
for a particular state, but we can speak of its having an average value 
for the state. We can go further and speak of the probability of its 
having any specified value for the state, meaning the probability of 
this specified value being obtained when one makes a measurement of 
the observable. This probability can be obtained from the general 
assumption in the following way. 

Let the observable be $\xi$ and let the state correspond to the
normalized ket $|x\rangle$. Then the general assumption tells us, not only that the
average value of $\xi$ is $\langle x|\xi|x\rangle$, but also that the average value of any
function of $\xi$, $f(\xi)$ say, is $\langle x|f(\xi)|x\rangle$. Take $f(\xi)$ to be that function of $\xi$
which is equal to unity when $\xi = a$, $a$ being some real number, and
zero otherwise. This function of $\xi$ has a meaning according to our
general theory of functions of an observable, and it may be denoted
by $\delta_{\xi a}$ in conformity with the general notation of the symbol $\delta$ with
two suffixes given on p. 62 (equation (17)). The average value of
this function of $\xi$ is just the probability, $P_a$ say, of $\xi$ having the value
$a$. Thus

$$P_a = \langle x|\delta_{\xi a}|x\rangle. \tag{45}$$

If $a$ is not an eigenvalue of $\xi$, $\delta_{\xi a}$ multiplied into any eigenket of $\xi$ is
zero, and hence $\delta_{\xi a} = 0$ and $P_a = 0$. This agrees with a conclusion
of § 10, that any result of a measurement of an observable must be
one of its eigenvalues.
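
For discrete eigenvalues and orthonormal eigenkets, (45) amounts to summing the squared moduli of the components of $|x\rangle$ along the eigenkets belonging to $a$. The sketch below illustrates this under assumptions not made in the text: the observable is the 2×2 matrix with rows (0, 1) and (1, 0), its eigenkets are entered by hand, and `inner`, `probability` and `eigenpairs` are hypothetical names.

```python
import math

def inner(a, b):
    """Scalar product <a|b> of two kets given as component lists."""
    return sum(u.conjugate() * v for u, v in zip(a, b))

def probability(a, eigenpairs, x):
    """P_a of (45) for orthonormal eigenkets and a normalized ket |x>."""
    return sum(abs(inner(ket, x)) ** 2 for value, ket in eigenpairs if value == a)

s = 1 / math.sqrt(2)
# Assumed observable: matrix [[0, 1], [1, 0]], eigenvalues +1 and -1.
eigenpairs = [(1, [s, s]), (-1, [s, -s])]
x = [0.6, 0.8]            # normalized state ket

p_plus = probability(1, eigenpairs, x)
p_minus = probability(-1, eigenpairs, x)
p_other = probability(2, eigenpairs, x)   # 2 is not an eigenvalue

mean = 1 * p_plus + (-1) * p_minus        # average rebuilt from the probabilities
print(p_plus, p_minus, p_other, mean)
```

The probabilities add up to unity, the value 2 gets probability zero, and the mean rebuilt from the probabilities agrees with $\langle x|\xi|x\rangle = 2 \times 0.6 \times 0.8$ up to rounding.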

If the possible results of a measurement of $\xi$ form a range of
numbers, the probability of $\xi$ having exactly a particular value will be
zero in most physical problems. The quantity of physical importance
is then the probability of $\xi$ having a value within a small range, say
from $a$ to $a+da$. This probability, which we may call $P(a)\,da$, is
equal to the average value of that function of $\xi$ which is equal to
unity for $\xi$ lying within the range $a$ to $a+da$ and zero otherwise.
This function of $\xi$ has a meaning according to our general theory of
functions of an observable. Denoting it by $\chi(\xi)$, we have

$$P(a)\,da = \langle x|\chi(\xi)|x\rangle. \tag{46}$$

If the range $a$ to $a+da$ does not include any eigenvalues of $\xi$, we
have as above $\chi(\xi) = 0$ and $P(a) = 0$. If $|x\rangle$ is not normalized, the
right-hand sides of (45) and (46) will still be proportional to the
probability of $\xi$ having the value $a$ and lying within the range $a$ to
$a+da$ respectively.

† This is not obviously so, since the sum may not have sufficient eigenstates to
form a complete set, in which case the sum, considered as a single quantity, would
not be measurable.

‡ Here the reality condition may fail, as well as the condition for the eigenstates
to form a complete set.

The assumption of § 10, that a measurement of $\xi$ is certain to give
the result $\xi'$ if the system is in an eigenstate of $\xi$ belonging to the
eigenvalue $\xi'$, is consistent with the general assumption for physical
interpretation and can in fact be deduced from it. Working from the
general assumption we see that, if $|\xi'\rangle$ is an eigenket of $\xi$ belonging
to the eigenvalue $\xi'$, then, in the case of discrete eigenvalues of $\xi$,

$$\delta_{\xi a}|\xi'\rangle = 0 \quad \text{unless } a = \xi',$$

and in the case of a range of eigenvalues of $\xi$

$$\chi(\xi)|\xi'\rangle = 0 \quad \text{unless the range $a$ to $a+da$ includes } \xi'.$$

In either case, for the state corresponding to $|\xi'\rangle$, the probability of
$\xi$ having any value other than $\xi'$ is zero.

An eigenstate of $\xi$ belonging to an eigenvalue $\xi'$ lying in a range
is a state which cannot strictly be realized in practice, since it would
need an infinite amount of precision to get $\xi$ to equal exactly $\xi'$.
The most that could be attained in practice would be to get $\xi$ to lie
within a narrow range about the value $\xi'$. The system would then
be in a state approximating to an eigenstate of $\xi$. Thus an eigenstate
belonging to an eigenvalue in a range is a mathematical idealization
of what can be attained in practice. All the same such eigenstates
play a very useful role in the theory and one could not very well do
without them. Science contains many examples of theoretical
concepts which are limits of things met with in practice and are useful
for the precise formulation of laws of nature, although they are not
realizable experimentally, and this is just one more of them. It may
be that the infinite length of the ket vectors corresponding to these
eigenstates is connected with their unrealizability, and that all
realizable states correspond to ket vectors that can be normalized and that
form a Hilbert space.

13. Commutability and compatibility

A state may be simultaneously an eigenstate of two observables.
If the state corresponds to the ket vector $|A\rangle$ and the observables are
$\xi$ and $\eta$, we should then have the equations

$$\xi|A\rangle = \xi'|A\rangle,$$
$$\eta|A\rangle = \eta'|A\rangle,$$

where $\xi'$ and $\eta'$ are eigenvalues of $\xi$ and $\eta$ respectively. We can now
deduce

$$\xi\eta|A\rangle = \xi\eta'|A\rangle = \xi'\eta'|A\rangle = \xi'\eta|A\rangle = \eta\xi'|A\rangle = \eta\xi|A\rangle,$$

or

$$(\xi\eta - \eta\xi)|A\rangle = 0.$$

This suggests that the chances for the existence of a simultaneous
eigenstate are most favourable if $\xi\eta - \eta\xi = 0$ and the two observables
commute. If they do not commute a simultaneous eigenstate is not
impossible, but is rather exceptional. On the other hand, if they do
commute there exist so many simultaneous eigenstates that they form a
complete set, as will now be proved.

Let $\xi$ and $\eta$ be two commuting observables. Take an eigenket of
$\eta$, $|\eta'\rangle$ say, belonging to the eigenvalue $\eta'$, and expand it in terms
of eigenkets of $\xi$ in the form of the right-hand side of (25), thus

$$|\eta'\rangle = \int |\xi'\eta'c\rangle\,d\xi' + \sum_r |\xi^r\eta'd\rangle. \tag{47}$$

The eigenkets of $\xi$ on the right-hand side here have $\eta'$ inserted in
them as an extra label, in order to remind us that they come from
the expansion of a special ket vector, namely $|\eta'\rangle$, and not a general
one as in equation (25). We can now show that each of these
eigenkets of $\xi$ is also an eigenket of $\eta$ belonging to the eigenvalue $\eta'$. We
have

$$0 = (\eta-\eta')|\eta'\rangle = \int (\eta-\eta')|\xi'\eta'c\rangle\,d\xi' + \sum_r (\eta-\eta')|\xi^r\eta'd\rangle. \tag{48}$$

Now the ket $(\eta-\eta')|\xi^r\eta'd\rangle$ satisfies

$$\begin{aligned}
\xi(\eta-\eta')|\xi^r\eta'd\rangle &= (\eta-\eta')\xi|\xi^r\eta'd\rangle = (\eta-\eta')\xi^r|\xi^r\eta'd\rangle \\
&= \xi^r(\eta-\eta')|\xi^r\eta'd\rangle,
\end{aligned}$$

showing that it is an eigenket of $\xi$ belonging to the eigenvalue $\xi^r$,
and similarly the ket $(\eta-\eta')|\xi'\eta'c\rangle$ is an eigenket of $\xi$ belonging to
the eigenvalue $\xi'$. Equation (48) thus gives an integral plus a sum
of eigenkets of $\xi$ equal to zero, which, as we have seen with equation
(31), is impossible unless the integrand and every term in the sum
vanishes. Hence

$$(\eta-\eta')|\xi'\eta'c\rangle = 0, \qquad (\eta-\eta')|\xi^r\eta'd\rangle = 0,$$

so that all the kets appearing on the right-hand side of (47) are
eigenkets of $\eta$ as well as of $\xi$. Equation (47) now gives $|\eta'\rangle$ expanded
in terms of simultaneous eigenkets of $\xi$ and $\eta$. Since any ket can be
expanded in terms of eigenkets $|\eta'\rangle$ of $\eta$, it follows that any ket can
be expanded in terms of simultaneous eigenkets of $\xi$ and $\eta$, and thus
the simultaneous eigenstates form a complete set.
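
The central step of this proof can be followed in a three-component example with whole-number entries. The sketch below is an illustration under assumptions not made in the text: `xi` and `eta` are two commuting real symmetric matrices chosen for convenience, `eta_ket` is an (unnormalized) eigenket of `eta`, and the two pieces are its components along eigenkets of `xi`, as in the discrete part of (47). All names are hypothetical.

```python
def apply(m, ket):
    """Result of the linear operator m multiplied into the ket."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in m]

def matmul(m, n):
    """Product of two square matrices of equal size."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

xi = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]    # assumed observable, eigenvalues +1, +1, -1
eta = [[3, 0, 0], [0, 3, 0], [0, 0, 7]]   # assumed observable, eigenvalues 3, 3, 7

commute = matmul(xi, eta) == matmul(eta, xi)

eta_ket = [2, 0, 0]       # eigenket of eta belonging to the eigenvalue 3
piece_plus = [1, 1, 0]    # eigenket of xi belonging to the eigenvalue +1
piece_minus = [1, -1, 0]  # eigenket of xi belonging to the eigenvalue -1

# The expansion of eta_ket in eigenkets of xi, as in the sum in (47).
expansion = [p + q for p, q in zip(piece_plus, piece_minus)]

print(commute, expansion == eta_ket)
print(apply(eta, piece_plus), apply(eta, piece_minus))  # each piece times 3
```

Although `eta_ket` itself is not an eigenket of `xi`, each piece of its expansion turns out to be an eigenket of `eta` with the same eigenvalue 3, and so a simultaneous eigenket of the two observables.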

The above simultaneous eigenkets of $\xi$ and $\eta$, $|\xi'\eta'c\rangle$ and $|\xi^r\eta'd\rangle$,
are labelled by the eigenvalues $\xi'$ and $\eta'$, or $\xi^r$ and $\eta'$, to which they
belong, together with the labels $c$ and $d$ which may also be necessary.
The procedure of using eigenvalues as labels for simultaneous
eigenvectors will be generally followed in the future, just as it has been
followed in the past for eigenvectors of single observables.

The converse to the above theorem says that, if $\xi$ and $\eta$ are two
observables such that their simultaneous eigenstates form a complete set,
then $\xi$ and $\eta$ commute. To prove this, we note that, if $|\xi'\eta'\rangle$ is a
simultaneous eigenket belonging to the eigenvalues $\xi'$ and $\eta'$,

$$(\xi\eta - \eta\xi)|\xi'\eta'\rangle = (\xi'\eta' - \eta'\xi')|\xi'\eta'\rangle = 0. \tag{49}$$

Since the simultaneous eigenstates form a complete set, an arbitrary
ket $|P\rangle$ can be expanded in terms of simultaneous eigenkets $|\xi'\eta'\rangle$,
for each of which (49) holds, and hence

$$(\xi\eta - \eta\xi)|P\rangle = 0$$

and so $\xi\eta - \eta\xi = 0$.

The idea of simultaneous eigenstates may be extended to more
than two observables and the above theorem and its converse still
hold, i.e. if any set of observables commute, each with all the others,
their simultaneous eigenstates form a complete set, and conversely.
The same arguments used for the proof with two observables are
adequate for the general case; e.g., if we have three commuting
observables $\xi$, $\eta$, $\zeta$, we can expand any simultaneous eigenket of $\xi$
and $\eta$ in terms of eigenkets of $\zeta$ and then show that each of these
eigenkets of $\zeta$ is also an eigenket of $\xi$ and of $\eta$. Thus the simultaneous
eigenket of $\xi$ and $\eta$ is expanded in terms of simultaneous eigenkets
of $\xi$, $\eta$, and $\zeta$, and since any ket can be expanded in terms of
simultaneous eigenkets of $\xi$ and $\eta$, it can also be expanded in terms of
simultaneous eigenkets of $\xi$, $\eta$, and $\zeta$.

The orthogonality theorem applied to simultaneous eigenkets tells
us that two simultaneous eigenvectors of a set of commuting
observables are orthogonal if the sets of eigenvalues to which they belong
differ in any way.

Owing to the simultaneous eigenstates of two or more commuting
observables forming a complete set, we can set up a theory of
functions of two or more commuting observables on the same lines as the
theory of functions of a single observable given in § 11. If $\xi, \eta, \zeta, \ldots$
are commuting observables, we define a general function $f$ of them
to be that linear operator $f(\xi, \eta, \zeta, \ldots)$ which satisfies

$$f(\xi, \eta, \zeta, \ldots)|\xi'\eta'\zeta'\ldots\rangle = f(\xi', \eta', \zeta', \ldots)|\xi'\eta'\zeta'\ldots\rangle, \tag{50}$$

where $|\xi'\eta'\zeta'\ldots\rangle$ is any simultaneous eigenket of $\xi, \eta, \zeta, \ldots$ belonging
to the eigenvalues $\xi', \eta', \zeta', \ldots$. Here $f$ is any function such that
$f(a, b, c, \ldots)$ is defined for all values of $a, b, c, \ldots$ which are eigenvalues
of $\xi, \eta, \zeta, \ldots$ respectively. As with a function of a single observable
defined by (34), we can show that $f(\xi, \eta, \zeta, \ldots)$ is completely
determined by (50), that

$$\overline{f(\xi, \eta, \zeta, \ldots)} = \bar f(\xi, \eta, \zeta, \ldots),$$

corresponding to (37), and that if $f(a, b, c, \ldots)$ is a real function,
$f(\xi, \eta, \zeta, \ldots)$ is real and is an observable.

We can now proceed to generalize the results (45) and (46). Given
a set of commuting observables $\xi, \eta, \zeta, \ldots$, we may form that function
of them which is equal to unity when $\xi = a$, $\eta = b$, $\zeta = c, \ldots$, $a, b, c, \ldots$
being real numbers, and is equal to zero when any of these conditions
is not fulfilled. This function may be written $\delta_{\xi a}\delta_{\eta b}\delta_{\zeta c}\ldots$, and is in
fact just the product in any order of the factors $\delta_{\xi a}, \delta_{\eta b}, \delta_{\zeta c}, \ldots$ defined
as functions of single observables, as may be seen by substituting this
product for $f(\xi, \eta, \zeta, \ldots)$ in the left-hand side of (50). The average
value of this function for any state is the probability, $P_{abc\ldots}$ say, of
$\xi, \eta, \zeta, \ldots$ having the values $a, b, c, \ldots$ respectively for that state. Thus
if the state corresponds to the normalized ket vector $|x\rangle$, we get from
our general assumption for physical interpretation

$$P_{abc\ldots} = \langle x|\delta_{\xi a}\delta_{\eta b}\delta_{\zeta c}\ldots|x\rangle. \tag{51}$$

$P_{abc\ldots}$ is zero unless each of the numbers $a, b, c, \ldots$ is an eigenvalue of
the corresponding observable. If any of the numbers $a, b, c, \ldots$ is an
eigenvalue in a range of eigenvalues of the corresponding observable,
$P_{abc\ldots}$ will usually again be zero, but in this case we ought to replace
the requirement that this observable shall have exactly one value by
the requirement that it shall have a value lying within a small range,
which involves replacing one of the $\delta$ factors in (51) by a factor like
the $\chi(\xi)$ of equation (46). On carrying out such a replacement for
each of the observables $\xi, \eta, \zeta, \ldots$ whose corresponding numerical
value $a, b, c, \ldots$ lies in a range of eigenvalues, we shall get a
probability which does not in general vanish.
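
In the purely discrete case with orthonormal simultaneous eigenkets, (51) can be evaluated by summing squared moduli over the simultaneous eigenkets carrying the required pair of eigenvalues. The Python sketch below is illustrative only. It assumes two commuting observables whose simultaneous eigenkets and eigenvalue pairs are listed by hand in `simultaneous`, a normalized state ket `x`, and the hypothetical helper names `inner` and `joint_probability`.

```python
import math

def inner(a, b):
    """Scalar product <a|b> of two kets given as component lists."""
    return sum(u.conjugate() * v for u, v in zip(a, b))

def joint_probability(a, b, simultaneous, x):
    """P_ab of (51) for orthonormal simultaneous eigenkets and normalized |x>."""
    return sum(abs(inner(ket, x)) ** 2
               for xi_value, eta_value, ket in simultaneous
               if xi_value == a and eta_value == b)

s = 1 / math.sqrt(2)
# Assumed simultaneous eigenkets: (eigenvalue of xi, eigenvalue of eta, ket).
simultaneous = [
    (1, 3, [s, s, 0.0]),
    (-1, 3, [s, -s, 0.0]),
    (1, 7, [0.0, 0.0, 1.0]),
]
x = [0.6, 0.0, 0.8]       # normalized state ket

p_1_3 = joint_probability(1, 3, simultaneous, x)
p_m1_3 = joint_probability(-1, 3, simultaneous, x)
p_1_7 = joint_probability(1, 7, simultaneous, x)
p_m1_7 = joint_probability(-1, 7, simultaneous, x)  # no such pair of eigenvalues

print(p_1_3, p_m1_3, p_1_7, p_m1_7)
print(p_1_3 + p_1_7)      # probability of xi = 1 whatever the value of eta
```

The joint probabilities add up to unity, a pair of values that belongs to no simultaneous eigenket gets probability zero, and summing over the values of one observable gives the probability distribution of the other.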

If certain observables commute, there exist states for which they all
have particular values, in the sense explained at the bottom of p. 46,
namely the simultaneous eigenstates. Thus one can give a meaning to
several commuting observables having values at the same time. Further, we
see from (51) that for any state one can give a meaning to the probability
of particular results being obtained for simultaneous measurements of
several commuting observables. This conclusion is an important new
development. In general one cannot make an observation on a
system in a definite state without disturbing that state and spoiling
it for the purposes of a second observation. One cannot then give
any meaning to the two observations being made simultaneously.
The above conclusion tells us, though, that in the special case when
the two observables commute, the observations are to be considered
as non-interfering or compatible, in such a way that one can give a
meaning to the two observations being made simultaneously and can
discuss the probability of any particular results being obtained. The
two observations may, in fact, be considered as a single observation
of a more complicated type, the result of which is expressible by two
numbers instead of a single number. From the point of view of general
theory, any two or more commuting observables may be counted as a
single observable, the result of a measurement of which consists of two or
more numbers. The states for which this measurement is certain to
lead to one particular result are the simultaneous eigenstates.

14. Basic vectors

In the preceding chapters we set up an algebraic scheme involving
certain abstract quantities of three kinds, namely bra vectors, ket
vectors, and linear operators, and we expressed some of the
fundamental laws of quantum mechanics in terms of them. It would be
possible to continue to develop the theory in terms of these abstract
quantities and to use them for applications to particular problems.
However, for some purposes it is more convenient to replace the
abstract quantities by sets of numbers with analogous mathematical
properties and to work in terms of these sets of numbers. The
procedure is similar to using coordinates in geometry, and has the
advantage of giving one greater mathematical power for the solving of
particular problems.

The way in which the abstract quantities are to be replaced by 
numbers is not unique, there being many possible ways corresponding 
to the many systems of coordinates one can have in geometry. Each 
of these ways is called a representation and the set of numbers that 
replace an abstract quantity is called the representative of that 
abstract quantity in the representation. Thus the representative of 
an abstract quantity corresponds to the coordinates of a geometrical 
object. When one has a particular problem to work out in quantum 
mechanics, one can minimize the labour by using a representation 
in which the representatives of the more important abstract quantities
occurring in that problem are as simple as possible.

To set up a representation in a general way, we take a complete 
set of bra vectors, i.e. a set such that any bra can be expressed 
linearly in terms of them (as a sum or an integral or possibly an 
integral plus a sum). These bras we call the basic bras of the representation.
They are sufficient, as we shall see, to fix the representation
completely. 

Take any ket $|a\rangle$ and form its scalar product with each of the basic
bras. The numbers so obtained constitute the representative of $|a\rangle$.
They are sufficient to determine the ket $|a\rangle$ completely, since if there
is a second ket, $|a_1\rangle$ say, for which these numbers are the same, the
difference $|a\rangle - |a_1\rangle$ will have its scalar product with any basic bra
vanishing, and hence its scalar product with any bra whatever will
vanish and $|a\rangle - |a_1\rangle$ itself will vanish.

We may suppose the basic bras to be labelled by one or more
parameters, $\lambda_1, \lambda_2, \ldots, \lambda_u$, each of which may take on certain numerical
values. The basic bras will then be written $\langle\lambda_1\lambda_2\ldots\lambda_u|$ and the
representative of $|a\rangle$ will be written $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$. This representative will
now consist of a set of numbers, one for each set of values that
$\lambda_1, \lambda_2, \ldots, \lambda_u$ may have in their respective domains. Such a set of
numbers just forms a function of the variables $\lambda_1, \lambda_2, \ldots, \lambda_u$. Thus the
representative of a ket may be looked upon either as a set of numbers
or as a function of the variables used to label the basic bras.

If the number of independent states of our dynamical system is 
finite, equal to n say, it is sufficient to take n basic bras, which may 
be labelled by a single parameter $\lambda$ taking on the values $1, 2, \ldots, n$.
The representative of any ket $|a\rangle$ now consists of the set of $n$ numbers
$\langle 1|a\rangle, \langle 2|a\rangle, \langle 3|a\rangle, \ldots, \langle n|a\rangle$, which are precisely the coordinates of
the vector $|a\rangle$ referred to a system of coordinates in the usual way.
The idea of the representative of a ket vector is just a generalization 
of the idea of the coordinates of an ordinary vector and reduces to 
the latter when the number of dimensions of the space of the ket 
vectors is finite. 
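
For the finite case, the representative can be computed directly. The following is a minimal sketch under assumptions that are not in the text: kets are stored as plain lists of complex components, the three basic vectors are an arbitrarily chosen orthonormal set, and the helper names `scalar_product` and `representative` are hypothetical.

```python
def scalar_product(bra_of, ket):
    """Return <b|a>, where `bra_of` holds the components of the ket |b>
    whose conjugate imaginary is the bra <b|."""
    return sum(b.conjugate() * a for b, a in zip(bra_of, ket))


def representative(basic_kets, ket):
    """Scalar products of `ket` with each basic bra, taken in order."""
    return [scalar_product(b, ket) for b in basic_kets]


# Three orthonormal basic vectors of a 3-dimensional ket space (assumed example).
s = 2 ** -0.5
basic = [
    [s, s, 0],
    [s, -s, 0],
    [0, 0, 1j],
]

ket_a = [1 + 2j, 3, -1j]
rep_a = representative(basic, ket_a)  # the numbers <1|a>, <2|a>, <3|a>

# The representative fixes the ket: rebuild |a> as the sum of |k><k|a>.
rebuilt = [sum(rep_a[k] * basic[k][i] for k in range(3)) for i in range(3)]
```

Because the three basic vectors are complete, `rebuilt` agrees with `ket_a`, which is the statement that the representative determines the ket completely.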

In a general representation there is no need for the basic bras to 
be all independent. In most representations used in practice, however,
they are all independent, and also satisfy the more stringent
condition that any two of them are orthogonal. The representation 
is then called an orthogonal representation. 

Take an orthogonal representation with basic bras $\langle\lambda_1\lambda_2\ldots\lambda_u|$,
labelled by parameters $\lambda_1, \lambda_2, \ldots, \lambda_u$ whose domains are all real. Take
a ket $|a\rangle$ and form its representative $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$. Now form the
numbers $\lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$ and consider them as the representative of
a new ket $|b\rangle$. This is permissible since the numbers forming the
representative of a ket are independent, on account of the basic bras
being independent. The ket $|b\rangle$ is defined by the equation

$$\langle\lambda_1\lambda_2\ldots\lambda_u|b\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle.$$

The ket $|b\rangle$ is evidently a linear function of the ket $|a\rangle$, so it may
be considered as the result of a linear operator applied to $|a\rangle$. Calling
this linear operator $L_1$, we have

$$|b\rangle = L_1|a\rangle$$

and hence

$$\langle\lambda_1\lambda_2\ldots\lambda_u|L_1|a\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle.$$

This equation holds for any ket $|a\rangle$, so we get

$$\langle\lambda_1\lambda_2\ldots\lambda_u|L_1 = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|. \tag{1}$$

Equation (1) may be looked upon as the definition of the linear
operator $L_1$. It shows that each basic bra is an eigenbra of $L_1$, the
value of the parameter $\lambda_1$ being the eigenvalue belonging to it.

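From the condition that the basic bras are orthogonal we can
deduce that $L_1$ is real and is an observable. Let $\lambda'_1, \lambda'_2, \ldots, \lambda'_u$ and
$\lambda''_1, \lambda''_2, \ldots, \lambda''_u$ be two sets of values for the parameters $\lambda_1, \lambda_2, \ldots, \lambda_u$.
We have, putting $\lambda'$'s for the $\lambda$'s in (1) and multiplying on the right
by $|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle$, the conjugate imaginary of the basic bra $\langle\lambda''_1\lambda''_2\ldots\lambda''_u|$,

$$\langle\lambda'_1\lambda'_2\ldots\lambda'_u|L_1|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle = \lambda'_1\langle\lambda'_1\lambda'_2\ldots\lambda'_u|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle.$$

Interchanging $\lambda'$'s and $\lambda''$'s,

$$\langle\lambda''_1\lambda''_2\ldots\lambda''_u|L_1|\lambda'_1\lambda'_2\ldots\lambda'_u\rangle = \lambda''_1\langle\lambda''_1\lambda''_2\ldots\lambda''_u|\lambda'_1\lambda'_2\ldots\lambda'_u\rangle.$$

On account of the basic bras being orthogonal, the right-hand sides
here vanish unless $\lambda''_r = \lambda'_r$ for all $r$ from 1 to $u$, in which case the
right-hand sides are equal, and they are also real, $\lambda'_1$ being real. Thus,
whether the $\lambda''$'s are equal to the $\lambda'$'s or not,

$$\begin{aligned}
\langle\lambda'_1\lambda'_2\ldots\lambda'_u|L_1|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle &= \overline{\langle\lambda''_1\lambda''_2\ldots\lambda''_u|L_1|\lambda'_1\lambda'_2\ldots\lambda'_u\rangle} \\
&= \langle\lambda'_1\lambda'_2\ldots\lambda'_u|\bar{L}_1|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle
\end{aligned}$$

from equation (4) of § 8. Since the $\langle\lambda'_1\lambda'_2\ldots\lambda'_u|$'s form a complete set
of bras and the $|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle$'s form a complete set of kets, we can
infer that $L_1 = \bar{L}_1$. The further condition required for $L_1$ to be an
observable, namely that its eigenstates shall form a complete set, is
obviously satisfied since it has as eigenbras the basic bras, which
form a complete set.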

We can similarly introduce linear operators $L_2, L_3, \ldots, L_u$ by multiplying
$\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$ by the factors $\lambda_2, \lambda_3, \ldots, \lambda_u$ in turn and considering
the resulting sets of numbers as representatives of kets. Each of these
$L$'s can be shown in the same way to have the basic bras as eigenbras
and to be real and an observable. The basic bras are simultaneous
eigenbras of all the $L$'s. Since these simultaneous eigenbras form a
complete set, it follows from a theorem of § 13 that any two of the
$L$'s commute.
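
These properties of the $L$'s can be checked in a small finite example. The sketch below rests on assumptions that are not in the text: a 4-dimensional space, four mutually orthogonal (deliberately unnormalized) basic vectors labelled by two real parameters, and an explicit construction of each operator as the sum of label times $|\lambda\rangle\langle\lambda|/\langle\lambda|\lambda\rangle$, which is one finite-dimensional way of satisfying equation (1). All function names are hypothetical.

```python
def inner(u, v):
    """Scalar product <u|v>; the components of the bra are conjugated."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def operator_from_labels(kets, labels):
    """Matrix of sum_k label_k |k><k| / <k|k> for mutually orthogonal kets."""
    n = len(kets[0])
    op = [[0j] * n for _ in range(n)]
    for ket, lam in zip(kets, labels):
        norm = inner(ket, ket)
        for i in range(n):
            for j in range(n):
                op[i][j] += lam * ket[i] * ket[j].conjugate() / norm
    return op


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose, the matrix of the conjugate complex operator."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def bra_times(ket, op):
    """Components of <ket| op, a row vector."""
    n = len(ket)
    return [sum(ket[i].conjugate() * op[i][j] for i in range(n)) for j in range(n)]


# Orthogonal basic vectors of different lengths, with labels (lambda_1, lambda_2).
kets = [[1, 1, 1, 1], [2, -2, 2, -2], [1j, 1j, -1j, -1j], [3, -3, -3, 3]]
lam1 = [1, 1, 2, 2]
lam2 = [10, 20, 10, 20]

L1 = operator_from_labels(kets, lam1)
L2 = operator_from_labels(kets, lam2)

L1_is_real = close(L1, dagger(L1))
L2_is_real = close(L2, dagger(L2))
they_commute = close(matmul(L1, L2), matmul(L2, L1))

# Equation (1): each basic bra is an eigenbra of L1 with eigenvalue lambda_1.
eigenbra_ok = all(
    abs(x - lam * c.conjugate()) < 1e-12
    for ket, lam in zip(kets, lam1)
    for x, c in zip(bra_times(ket, L1), ket)
)
```

In this example each of `lam1` and `lam2` is degenerate on its own, but the pairs of labels are all different, so the two operators together single out each basic vector.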

It will now be shown that, if $\xi_1, \xi_2, \ldots, \xi_u$ are any set of commuting
observables, we can set up an orthogonal representation in which the basic
bras are simultaneous eigenbras of $\xi_1, \xi_2, \ldots, \xi_u$. Let us suppose first that
there is only one independent simultaneous eigenbra of $\xi_1, \xi_2, \ldots, \xi_u$
belonging to any set of eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u$. Then we may take
these simultaneous eigenbras, with arbitrary numerical coefficients, as
our basic bras. They are all orthogonal on account of the orthogonality
theorem (any two of them will have at least one eigenvalue different,
which is sufficient to make them orthogonal) and there are sufficient
of them to form a complete set, from a result of § 13. They may
conveniently be labelled by the eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u$ to which they
belong, so that one of them is written $\langle\xi'_1\xi'_2\ldots\xi'_u|$.

Passing now to the general case when there are several independent
simultaneous eigenbras of $\xi_1, \xi_2, \ldots, \xi_u$ belonging to some sets of eigenvalues,
we must pick out from all the simultaneous eigenbras belonging
to a set of eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u$ a complete subset, the members
of which are all orthogonal to one another. (The condition of completeness
here means that any simultaneous eigenbra belonging to the
eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u$ can be expressed linearly in terms of the
members of the subset.) We must do this for each set of eigenvalues
$\xi'_1, \xi'_2, \ldots, \xi'_u$ and then put all the members of all the subsets together
and take them as the basic bras of the representation. These bras
are all orthogonal, two of them being orthogonal from the orthogonality
theorem if they belong to different sets of eigenvalues and from
the special way in which they were chosen if they belong to the same
set of eigenvalues, and they form altogether a complete set of bras,
as any bra can be expressed linearly in terms of simultaneous eigenbras
and each simultaneous eigenbra can then be expressed linearly
in terms of the members of a subset. There are infinitely many ways
of choosing the subsets, and each way provides one orthogonal
representation.

For labelling the basic bras in this general case, we may use the
eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u$ to which they belong, together with certain
additional real variables $\lambda_1, \lambda_2, \ldots, \lambda_v$ say, which must be introduced to
distinguish basic vectors belonging to the same set of eigenvalues
from one another. A basic bra is then written $\langle\xi'_1\xi'_2\ldots\xi'_u\,\lambda_1\lambda_2\ldots\lambda_v|$.
Corresponding to the variables $\lambda_1, \lambda_2, \ldots, \lambda_v$ we can define linear
operators $L_1, L_2, \ldots, L_v$ by equations like (1) and can show that these
linear operators have the basic bras as eigenbras, and that they are
real and observables, and that they commute with one another and
with the $\xi$'s. The basic bras are now simultaneous eigenbras of all
the commuting observables $\xi_1, \xi_2, \ldots, \xi_u, L_1, L_2, \ldots, L_v$.
Let us define a complete set of commuting observables to be a set of
observables which all commute with one another and for which there
is only one simultaneous eigenstate belonging to any set of eigenvalues.
Then the observables $\xi_1, \xi_2, \ldots, \xi_u, L_1, L_2, \ldots, L_v$ form a complete
set of commuting observables, there being only one independent simultaneous
eigenbra belonging to the eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u, \lambda_1, \lambda_2, \ldots, \lambda_v$,
namely the corresponding basic bra. Similarly the observables
$L_1, L_2, \ldots, L_u$ defined by equation (1) and the following work form
a complete set of commuting observables. With the help of this
definition the main results of the present section can be concisely
formulated thus:

(i) The basic bras of an orthogonal representation are simultaneous
eigenbras of a complete set of commuting observables.

(ii) Given a complete set of commuting observables, we can set 
up an orthogonal representation in which the basic bras are 
simultaneous eigenbras of this complete set. 

(iii) Any set of commuting observables can be made into a complete
commuting set by adding certain observables to it.

(iv) A convenient way of labelling the basic bras of an orthogonal 
representation is by means of the eigenvalues of the complete 
set of commuting observables of which the basic bras are 
simultaneous eigenbras. 

The conjugate imaginaries of the basic bras of a representation we
call the basic kets of the representation. Thus, if the basic bras are
denoted by $\langle\lambda_1\lambda_2\ldots\lambda_u|$, the basic kets will be denoted by $|\lambda_1\lambda_2\ldots\lambda_u\rangle$.
The representative of a bra $\langle b|$ is given by its scalar product with
each of the basic kets, i.e. by $\langle b|\lambda_1\lambda_2\ldots\lambda_u\rangle$. It may, like the
representative of a ket, be looked upon either as a set of numbers or as a
function of the variables $\lambda_1, \lambda_2, \ldots, \lambda_u$. We have

$$\langle b|\lambda_1\lambda_2\ldots\lambda_u\rangle = \overline{\langle\lambda_1\lambda_2\ldots\lambda_u|b\rangle},$$

showing that the representative of a bra is the conjugate complex of the
representative of the conjugate imaginary ket. In an orthogonal
representation, where the basic bras are simultaneous eigenbras of a
complete set of commuting observables, $\xi_1, \xi_2, \ldots, \xi_u$ say, the basic kets
will be simultaneous eigenkets of $\xi_1, \xi_2, \ldots, \xi_u$.

We have not yet considered the lengths of the basic vectors. With 
an orthogonal representation, the natural thing to do is to normalize 
the basic vectors, rather than leave their lengths arbitrary, and so
introduce a further stage of simplification into the representation. 
However, it is possible to normalize them only if the parameters 
which label them all take on discrete values. If any of these parameters
are continuous variables that can take on all values in a range,
the basic vectors are eigenvectors of some observable belonging to 
eigenvalues in a range and are of infinite length, from the discussion 
in § 10 (see p. 39 and top of p. 40). Some other procedure is then 
needed to fix the numerical factors by which the basic vectors may 
be multiplied. To get a convenient method of handling this question 
a new mathematical notation is required, which will be given in the 
next section. 

15. The δ function

Our work in § 10 led us to consider quantities involving a certain
kind of infinity. To get a precise notation for dealing with these
infinities, we introduce a quantity $\delta(x)$ depending on a parameter $x$
satisfying the conditions

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1, \qquad \delta(x) = 0 \ \text{ for } x \neq 0. \tag{2}$$

To get a picture of $\delta(x)$, take a function of the real variable $x$ which
vanishes everywhere except inside a small domain, of length $\epsilon$ say,
surrounding the origin $x = 0$, and which is so large inside this domain
that its integral over this domain is unity. The exact shape of the
function inside this domain does not matter, provided there are no
unnecessarily wild variations (for example provided the function
is always of order $\epsilon^{-1}$). Then in the limit $\epsilon \to 0$ this function will go
over into $\delta(x)$.

$\delta(x)$ is not a function of $x$ according to the usual mathematical
definition of a function, which requires a function to have a definite
value for each point in its domain, but is something more general,
which we may call an 'improper function' to show up its difference
from a function defined by the usual definition. Thus $\delta(x)$ is not a
quantity which can be generally used in mathematical analysis like 
an ordinary function, but its use must be confined to certain simple 
types of expression for which it is obvious that no inconsistency 
can arise. 
The most important property of $\delta(x)$ is exemplified by the following
equation,

$$\int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0), \tag{3}$$

where $f(x)$ is any continuous function of $x$. We can easily see the
validity of this equation from the above picture of $\delta(x)$. The left-hand
side of (3) can depend only on the values of $f(x)$ very close
to the origin, so that we may replace $f(x)$ by its value at the origin,
$f(0)$, without essential error. Equation (3) then follows from the
first of equations (2). By making a change of origin in (3), we can
deduce the formula

$$\int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a), \tag{4}$$

where $a$ is any real number. Thus the process of multiplying a function
of $x$ by $\delta(x-a)$ and integrating over all $x$ is equivalent to the process of
substituting $a$ for $x$. This general result holds also if the function of $x$ is
not a numerical one, but is a vector or linear operator depending on $x$.
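
The substitution property (4) can be illustrated numerically with a narrow pulse standing in for the improper function. This is only a sketch under assumptions that are not in the text: a triangular pulse of half-width `w` is used (the text leaves the shape open), the integral is approximated by a midpoint rule, and the names `pulse` and `integrate` are hypothetical.

```python
import math


def pulse(x, w=0.005):
    """Triangular pulse of half-width w and unit area, centred on x = 0."""
    return max(0.0, 1.0 - abs(x) / w) / w


def integrate(g, lo, hi, n=200_000):
    """Midpoint-rule approximation to the integral of g from lo to hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


f = math.cos
a = 0.7

# Left-hand side of (4) with the pulse in place of the delta function.
approx = integrate(lambda x: f(x) * pulse(x - a), -1.0, 2.0)
exact = f(a)
```

As `w` is made smaller (with the grid refined accordingly), `approx` moves closer to `exact`, in line with the limiting picture described above.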

The range of integration in (3) and (4) need not be from —oo to oo, 
but may be over any domain surrounding the critical point at which 
the $\delta$ function does not vanish. In future the limits of integration
will usually be omitted in such equations, it being understood that 
the domain of integration is a suitable one. 

Equations (3) and (4) show that, although an improper function 
does not itself have a well-defined value, when it occurs as a factor 
in an integrand the integral has a well-defined value. In quantum 
theory, whenever an improper function appears, it will be something 
which is to be used ultimately in an integrand. Therefore it should be 
possible to rewrite the theory in a form in which the improper functions
appear all through only in integrands. One could then eliminate
the improper functions altogether. The use of improper functions 
thus does not involve any lack of rigour in the theory, but is merely 
a convenient notation, enabling us to express in a concise form 
certain relations which we could, if necessary, rewrite in a form not 
involving improper functions, but only in a cumbersome way which 
would tend to obscure the argument. 

An alternative way of defining the $\delta$ function is as the differential
coefficient $\epsilon'(x)$ of the function $\epsilon(x)$ given by

$$\epsilon(x) = 0 \quad (x < 0), \qquad \epsilon(x) = 1 \quad (x > 0). \tag{5}$$

We may verify that this is equivalent to the previous definition by
substituting $\epsilon'(x)$ for $\delta(x)$ in the left-hand side of (3) and integrating
by parts. We find, for $g_1$ and $g_2$ two positive numbers,

$$\begin{aligned}
\int_{-g_2}^{g_1} f(x)\,\epsilon'(x)\,dx &= \bigl[f(x)\,\epsilon(x)\bigr]_{-g_2}^{g_1} - \int_{-g_2}^{g_1} f'(x)\,\epsilon(x)\,dx \\
&= f(g_1) - \int_{0}^{g_1} f'(x)\,dx \\
&= f(0),
\end{aligned}$$

in agreement with (3). The $\delta$ function appears whenever one differentiates
a discontinuous function.

There are a number of elementary equations which one can write 
down about $\delta$ functions. These equations are essentially rules of
manipulation for algebraic work involving $\delta$ functions. The meaning
of any of these equations is that its two sides give equivalent results 
as factors in an integrand. 

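Examples of such equations are

$$\delta(-x) = \delta(x), \tag{6}$$

$$x\,\delta(x) = 0, \tag{7}$$

$$\delta(ax) = a^{-1}\,\delta(x) \qquad (a > 0), \tag{8}$$

$$\delta(x^2 - a^2) = \tfrac{1}{2}a^{-1}\{\delta(x-a) + \delta(x+a)\} \qquad (a > 0), \tag{9}$$

$$\int \delta(a-x)\,dx\,\delta(x-b) = \delta(a-b), \tag{10}$$

$$f(x)\,\delta(x-a) = f(a)\,\delta(x-a). \tag{11}$$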

Equation (6), which merely states that $\delta(x)$ is an even function of its
variable $x$, is trivial. To verify (7) take any continuous function of
$x$, $f(x)$. Then

$$\int f(x)\,x\,\delta(x)\,dx = 0,$$

from (3). Thus $x\,\delta(x)$ as a factor in an integrand is equivalent to
zero, which is just the meaning of (7). (8) and (9) may be verified
by similar elementary arguments. To verify (10) take any continuous
function of $a$, $f(a)$. Then

$$\begin{aligned}
\int f(a)\,da \int \delta(a-x)\,dx\,\delta(x-b) &= \int \delta(x-b)\,dx \int f(a)\,da\,\delta(a-x) \\
&= \int \delta(x-b)\,dx\,f(x) = \int f(a)\,da\,\delta(a-b).
\end{aligned}$$

Thus the two sides of (10) are equivalent as factors in an integrand
with $a$ as variable of integration. It may be shown in the same way
that they are equivalent also as factors in an integrand with $b$ as
variable of integration, so that equation (10) is justified from either
of these points of view. Equation (11) is also easily justified, with
the help of (4), from two points of view.
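
Rules (8) and (9) can be checked numerically in the sense stated above, namely that the two sides give equivalent results as factors in an integrand. The sketch below makes assumptions that are not in the text: the improper function is replaced by a narrow triangular pulse, the test function is taken to be the exponential, and the helper names are hypothetical.

```python
import math


def pulse(u, w=0.02):
    """Triangular pulse of half-width w and unit area, standing in for delta(u)."""
    return max(0.0, 1.0 - abs(u) / w) / w


def integrate(g, lo=-2.0, hi=2.0, n=200_000):
    """Midpoint-rule approximation to the integral of g from lo to hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


f = math.exp
a = 1.5

# Rule (8) with the scale factor 3: delta(3x) acts like delta(x) / 3.
lhs_8 = integrate(lambda x: f(x) * pulse(3.0 * x))
rhs_8 = f(0.0) / 3.0

# Rule (9): delta(x^2 - a^2) acts like {delta(x - a) + delta(x + a)} / (2a).
lhs_9 = integrate(lambda x: f(x) * pulse(x * x - a * a))
rhs_9 = (f(a) + f(-a)) / (2.0 * a)
```

The agreement is approximate and improves as the pulse is narrowed, since the rules hold exactly only in the limit.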

Equation (10) would be given by an application of (4) with
$f(x) = \delta(x-b)$. We have here an illustration of the fact that we may
often use an improper function as though it were an ordinary continuous
function, without getting a wrong result.

Equation (7) shows that, whenever one divides both sides of an
equation by a variable $x$ which can take on the value zero, one
should add on to one side an arbitrary multiple of $\delta(x)$, i.e. from an
equation

$$A = B \tag{12}$$

one cannot infer

$$A/x = B/x,$$

but only

$$A/x = B/x + c\,\delta(x), \tag{13}$$

where $c$ is unknown.

As an illustration of work with the $\delta$ function, we may consider the
differentiation of $\log x$. The usual formula

$$\frac{d}{dx}\log x = \frac{1}{x} \tag{14}$$

requires examination for the neighbourhood of $x = 0$. In order to
make the reciprocal function $1/x$ well defined in the neighbourhood
of $x = 0$ (in the sense of an improper function) we must impose on
it an extra condition, such as that its integral from $-\epsilon$ to $\epsilon$ vanishes.
With this extra condition, the integral of the right-hand side of (14)
from $-\epsilon$ to $\epsilon$ vanishes, while that of the left-hand side of (14) equals
$\log(-1)$, so that (14) is not a correct equation. To correct it, we must
remember that, taking principal values, $\log x$ has a pure imaginary
term $i\pi$ for negative values of $x$. As $x$ passes through the value zero
this pure imaginary term vanishes discontinuously. The differentiation
of this pure imaginary term gives us the result $-i\pi\,\delta(x)$, so
that (14) should read

$$\frac{d}{dx}\log x = \frac{1}{x} - i\pi\,\delta(x). \tag{15}$$

The particular combination of reciprocal function and $\delta$ function
appearing in (15) plays an important part in the quantum theory of
collision processes (see § 50).
16. Properties of the basic vectors

Using the notation of the $\delta$ function, we can proceed with the theory
of representations. Let us suppose first that we have a single observable
$\xi$ forming by itself a complete commuting set, the condition for
this being that there is only one eigenstate of $\xi$ belonging to any
eigenvalue $\xi'$, and let us set up an orthogonal representation in which
the basic vectors are eigenvectors of $\xi$ and are written $\langle\xi'|$, $|\xi'\rangle$.

In the case when the eigenvalues of $\xi$ are discrete, we can normalize
the basic vectors, and we then have

$$\langle\xi'|\xi''\rangle = 0 \quad (\xi' \neq \xi''), \qquad \langle\xi'|\xi'\rangle = 1.$$

These equations can be combined into the single equation

$$\langle\xi'|\xi''\rangle = \delta_{\xi'\xi''}, \tag{16}$$

where the symbol $\delta$ with two suffixes, which we shall often use in the
future, has the meaning

$$\delta_{rs} = 0 \ \text{ when } r \neq s, \qquad \delta_{rs} = 1 \ \text{ when } r = s. \tag{17}$$

In the case when the eigenvalues of $\xi$ are continuous we cannot
normalize the basic vectors. If we now consider the quantity $\langle\xi'|\xi''\rangle$
with $\xi'$ fixed and $\xi''$ varying, we see from the work connected with
expression (29) of § 10 that this quantity vanishes for $\xi'' \neq \xi'$ and
that its integral over a range of $\xi''$ extending through the value $\xi'$
is finite, equal to $c$ say. Thus

$$\langle\xi'|\xi''\rangle = c\,\delta(\xi'-\xi'').$$

From (30) of § 10, $c$ is a positive number. It may vary with $\xi'$, so
we should write it $c(\xi')$ or $c'$ for brevity, and thus we have

$$\langle\xi'|\xi''\rangle = c'\,\delta(\xi'-\xi''). \tag{18}$$

Alternatively, we have

$$\langle\xi'|\xi''\rangle = c''\,\delta(\xi'-\xi''), \tag{19}$$

where $c''$ is short for $c(\xi'')$, the right-hand sides of (18) and (19) being
equal on account of (11).

Let us pass to another representation whose basic vectors are
eigenvectors of $\xi$, the new basic vectors being numerical multiples of
the previous ones. Calling the new basic vectors $\langle\xi'^*|$, $|\xi'^*\rangle$, with the
additional label $*$ to distinguish them from the previous ones, we have

$$\langle\xi'^*| = k'\langle\xi'|, \qquad |\xi'^*\rangle = \bar{k}'|\xi'\rangle,$$

where $k'$ is short for $k(\xi')$ and is a number depending on $\xi'$. We get

$$\langle\xi'^*|\xi''^*\rangle = k'\bar{k}''\langle\xi'|\xi''\rangle = k'\bar{k}''c'\,\delta(\xi'-\xi'')$$

with the help of (18). This may be written

$$\langle\xi'^*|\xi''^*\rangle = k'\bar{k}'c'\,\delta(\xi'-\xi'')$$

from (11). By choosing $k'$ so that its modulus is $c'^{-1/2}$, which is possible
since $c'$ is positive, we arrange to have

$$\langle\xi'^*|\xi''^*\rangle = \delta(\xi'-\xi''). \tag{20}$$

The lengths of the new basic vectors are now fixed so as to make the
representation as simple as possible. The way these lengths were
fixed is in some respects analogous to the normalizing of the basic
vectors in the case of discrete $\xi'$, equation (20) being of the form of
(16) with the $\delta$ function $\delta(\xi'-\xi'')$ replacing the $\delta$ symbol $\delta_{\xi'\xi''}$ of
equation (16). We shall continue to work with the new representation
and shall drop the $*$ labels in it to save writing. Thus (20) will now
be written

$$\langle\xi'|\xi''\rangle = \delta(\xi'-\xi''). \tag{21}$$

We can develop the theory on closely parallel lines for the discrete
and continuous cases. For the discrete case we have, using (16),

$$\sum_{\xi'} |\xi'\rangle\langle\xi'|\xi''\rangle = \sum_{\xi'} |\xi'\rangle\,\delta_{\xi'\xi''} = |\xi''\rangle,$$

the sum being taken over all eigenvalues. This equation holds for
any basic ket $|\xi''\rangle$ and hence, since the basic kets form a complete set,

$$\sum_{\xi'} |\xi'\rangle\langle\xi'| = 1. \tag{22}$$

This is a useful equation expressing an important property of the
basic vectors, namely, if $|\xi'\rangle$ is multiplied on the right by $\langle\xi'|$ the
resulting linear operator, summed for all $\xi'$, equals the unit operator.
Equations (16) and (22) give the fundamental properties of the basic
vectors for the discrete case.
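
Equations (16) and (22) can be verified in a small discrete example. The sketch below assumes, purely for illustration, a 4-dimensional space whose normalized basic kets are discrete Fourier vectors; this choice and the helper names are not from the text.

```python
import cmath

n = 4

# Orthonormal basic kets |k>, k = 0..n-1 (assumed discrete Fourier vectors).
basic_kets = [
    [cmath.exp(2j * cmath.pi * j * k / n) / n ** 0.5 for j in range(n)]
    for k in range(n)
]


def inner(u, v):
    """Scalar product <u|v>."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def outer(ket, bra_of):
    """Matrix of the linear operator |ket><bra_of|."""
    return [[a * b.conjugate() for b in bra_of] for a in ket]


# Equation (16): scalar products of basic vectors give the two-suffix delta symbol.
gram = [[inner(u, v) for v in basic_kets] for u in basic_kets]

# Equation (22): summing |k><k| over all k gives the unit operator.
total = [[0j] * n for _ in range(n)]
for ket in basic_kets:
    projector = outer(ket, ket)
    for i in range(n):
        for j in range(n):
            total[i][j] += projector[i][j]
```

Both `gram` and `total` come out as the unit matrix to within rounding error.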

Similarly, for the continuous case we have, using (21),

$$\int |\xi'\rangle\,d\xi'\,\langle\xi'|\xi''\rangle = \int |\xi'\rangle\,d\xi'\,\delta(\xi'-\xi'') = |\xi''\rangle \tag{23}$$

from (4) applied with a ket vector for $f(x)$, the range of integration
being the range of eigenvalues. This holds for any basic ket $|\xi''\rangle$
and hence

$$\int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1. \tag{24}$$

This is of the same form as (22) with an integral replacing the sum.
Equations (21) and (24) give the fundamental properties of the basic
vectors for the continuous case.

Equations (22) and (24) enable one to expand any bra or ket in
terms of the basic vectors. For example, we get for the ket $|P\rangle$ in the
discrete case, by multiplying (22) on the right by $|P\rangle$,

$$|P\rangle = \sum_{\xi'} |\xi'\rangle\langle\xi'|P\rangle, \tag{25}$$

which gives $|P\rangle$ expanded in terms of the $|\xi'\rangle$'s and shows that the
coefficients in the expansion are $\langle\xi'|P\rangle$, which are just the numbers
forming the representative of $|P\rangle$. Similarly, in the continuous case,

$$|P\rangle = \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle, \tag{26}$$

giving $|P\rangle$ as an integral over the $|\xi'\rangle$'s, with the coefficient in the
integrand again just the representative $\langle\xi'|P\rangle$ of $|P\rangle$. The conjugate
imaginary equations to (25) and (26) would give the bra vector $\langle P|$
expanded in terms of the basic bras.

Our present mathematical methods enable us in the continuous
case to expand any ket as an integral of eigenkets of $\xi$. If we do not
use the $\delta$ function notation, the expansion of a general ket will consist
of an integral plus a sum, as in equation (25) of § 10, but the $\delta$ function
enables us to replace the sum by an integral in which the integrand
consists of terms each containing a $\delta$ function as a factor. For
example, the eigenket $|\xi''\rangle$ may be replaced by an integral of eigenkets,
as is shown by the second of equations (23).

If $\langle Q|$ is any bra and $|P\rangle$ any ket we get, by further applications
of (22) and (24),

$$\langle Q|P\rangle = \sum_{\xi'} \langle Q|\xi'\rangle\langle\xi'|P\rangle \tag{27}$$

for discrete $\xi'$ and

$$\langle Q|P\rangle = \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{28}$$

for continuous $\xi'$. These equations express the scalar product of
$\langle Q|$ and $|P\rangle$ in terms of their representatives $\langle Q|\xi'\rangle$ and $\langle\xi'|P\rangle$.
Equation (27) is just the usual formula for the scalar product of two
vectors in terms of the coordinates of the vectors, and (28) is the
natural modification of this formula for the case of continuous $\xi'$,
with an integral instead of a sum.

The generalization of the foregoing work to the case when $\xi$ has
both discrete and continuous eigenvalues is quite straightforward.
Using $\xi^r$ and $\xi^s$ to denote discrete eigenvalues and $\xi'$ and $\xi''$ to denote
continuous eigenvalues, we have the set of equations

$$\langle\xi^r|\xi^s\rangle = \delta_{\xi^r\xi^s}, \qquad \langle\xi^r|\xi'\rangle = 0, \qquad \langle\xi'|\xi''\rangle = \delta(\xi'-\xi'') \tag{29}$$

as the generalization of (16) or (21). These equations express that
the basic vectors are all orthogonal, that those belonging to discrete
eigenvalues are normalized and those belonging to continuous
eigenvalues have their lengths fixed by the same rule as led to (20). From
(29) we can derive, as the generalization of (22) or (24),

$$\sum_{\xi^r} |\xi^r\rangle\langle\xi^r| + \int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1, \tag{30}$$

the range of integration being the range of continuous eigenvalues.
With the help of (30), we get immediately

$$|P\rangle = \sum_{\xi^r} |\xi^r\rangle\langle\xi^r|P\rangle + \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{31}$$

as the generalization of (25) or (26), and

$$\langle Q|P\rangle = \sum_{\xi^r} \langle Q|\xi^r\rangle\langle\xi^r|P\rangle + \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{32}$$

as the generalization of (27) or (28).
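
For the purely discrete case, formula (27) can be checked by computing a scalar product twice, once directly from components and once from the representatives in some other orthonormal representation. The sketch below assumes a 4-dimensional space and a discrete Fourier set of basic kets; both choices, the sample vectors, and the helper name `inner` are illustrative assumptions, not part of the text.

```python
import cmath

n = 4

# Orthonormal basic kets of the representation (assumed discrete Fourier vectors).
basic = [
    [cmath.exp(2j * cmath.pi * j * k / n) / n ** 0.5 for j in range(n)]
    for k in range(n)
]


def inner(u, v):
    """Scalar product <u|v>."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


ket_p = [1 + 1j, 2, -1j, 0.5]
ket_q = [0, 1j, 3, -2 + 1j]  # <Q| is the conjugate imaginary of this ket

rep_p = [inner(b, ket_p) for b in basic]  # the numbers <xi'|P>
rep_q = [inner(ket_q, b) for b in basic]  # the numbers <Q|xi'>

# Right-hand side of (27): sum over xi' of <Q|xi'><xi'|P>.
via_representatives = sum(q * p for q, p in zip(rep_q, rep_p))
direct = inner(ket_q, ket_p)
```

The two values agree because the basic vectors satisfy (22); the same calculation with any other complete orthonormal set would give the same scalar product.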

Let us now pass to the general case when we have several commuting
observables $\xi_1, \xi_2, \ldots, \xi_u$ forming a complete commuting set and set up
an orthogonal representation in which the basic vectors are simultaneous
eigenvectors of all of them, and are written $\langle\xi'_1\ldots\xi'_u|$, $|\xi'_1\ldots\xi'_u\rangle$.
Let us suppose $\xi_1, \xi_2, \ldots, \xi_v$ ($v \leq u$) have discrete eigenvalues and
$\xi_{v+1}, \ldots, \xi_u$ have continuous eigenvalues.

Consider the quantity $\langle\xi'_1\ldots\xi'_v\,\xi'_{v+1}\ldots\xi'_u|\xi'_1\ldots\xi'_v\,\xi''_{v+1}\ldots\xi''_u\rangle$. From the
orthogonality theorem, it must vanish unless each $\xi''_s = \xi'_s$ for
$s = v+1, \ldots, u$. By extending the work connected with expression
(29) of § 10 to simultaneous eigenvectors of several commuting
observables and extending also the axiom (30), we find that the
$(u-v)$-fold integral of this quantity with respect to each $\xi''_s$ over
a range extending through the value $\xi'_s$ is a finite positive number.
Calling this number $c'$, the $'$ denoting that it is a function of
$\xi'_1, \ldots, \xi'_v, \xi'_{v+1}, \ldots, \xi'_u$, we can express our results by the equation

$$\langle\xi'_1\ldots\xi'_v\,\xi'_{v+1}\ldots\xi'_u|\xi'_1\ldots\xi'_v\,\xi''_{v+1}\ldots\xi''_u\rangle = c'\,\delta(\xi'_{v+1}-\xi''_{v+1})\ldots\delta(\xi'_u-\xi''_u), \tag{33}$$

with one $\delta$ factor on the right-hand side for each value of $s$ from
$v+1$ to $u$. We now change the lengths of our basic vectors so as to
make $c'$ unity, by a procedure similar to that which led to (20). By
a further use of the orthogonality theorem, we get finally

$$\langle\xi'_1\ldots\xi'_u|\xi''_1\ldots\xi''_u\rangle = \delta_{\xi'_1\xi''_1}\ldots\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1})\ldots\delta(\xi'_u-\xi''_u), \tag{34}$$

with a two-suffix $\delta$ symbol on the right-hand side for each $\xi$ with
discrete eigenvalues and a $\delta$ function for each $\xi$ with continuous
eigenvalues. This is the generalization of (16) or (21) to the case when
there are several commuting observables in the complete set.

From (34) we can derive, as the generalization of (22) or (24),

$$\sum_{\xi'_1\ldots\xi'_v} \int\!\cdots\!\int |\xi'_1\ldots\xi'_u\rangle\,d\xi'_{v+1}\ldots d\xi'_u\,\langle\xi'_1\ldots\xi'_u| = 1, \tag{35}$$

the integral being a $(u-v)$-fold one over all the $\xi'$'s with continuous
eigenvalues and the summation being over all the $\xi'$'s with discrete
eigenvalues. Equations (34) and (35) give the fundamental properties
of the basic vectors in the present case. From (35) we can immediately
write down the generalization of (25) or (26) and of (27) or (28).

The case we have just considered can be further generalized by
allowing some of the $\xi$'s to have both discrete and continuous
eigenvalues. The modifications required in the equations are quite
straightforward, but will not be given here as they are rather cumbersome to
write down in general form.

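There are some problems in which it is convenient not to make the
$c'$ of equation (33) equal unity, but to make it equal to some definite
function of the $\xi'$'s instead. Calling this function of the $\xi'$'s $\rho'^{-1}$ we
then have, instead of (34),

$$\langle\xi'_1\ldots\xi'_u|\xi''_1\ldots\xi''_u\rangle = \rho'^{-1}\,\delta_{\xi'_1\xi''_1}\ldots\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1})\ldots\delta(\xi'_u-\xi''_u), \tag{36}$$

and instead of (35) we get

$$\sum_{\xi'_1\ldots\xi'_v} \int\!\cdots\!\int |\xi'_1\ldots\xi'_u\rangle\,\rho'\,d\xi'_{v+1}\ldots d\xi'_u\,\langle\xi'_1\ldots\xi'_u| = 1. \tag{37}$$

$\rho'$ is called the weight function of the representation, $\rho'\,d\xi'_{v+1}\ldots d\xi'_u$
being the 'weight' attached to a small volume element of the space
of the variables $\xi'_{v+1}, \ldots, \xi'_u$.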

The representations we considered previously all had the weight
function unity. The introduction of a weight function not unity is
entirely a matter of convenience and does not add anything to the
mathematical power of the representation. The basic bras $\langle\xi'_1\ldots\xi'_u{}^*|$
of a representation with the weight function $\rho'$ are connected with
the basic bras $\langle\xi'_1\ldots\xi'_u|$ of the corresponding representation with the
weight function unity by

$$\langle\xi'_1\ldots\xi'_u{}^*| = \rho'^{-1/2}\,\langle\xi'_1\ldots\xi'_u|, \tag{38}$$

as is easily verified. An example of a useful representation with
non-unit weight function occurs when one has two $\xi$'s which are
the polar and azimuthal angles $\theta$ and $\phi$ giving a direction in
three-dimensional space and one takes $\rho' = \sin\theta'$. One then has the element
of solid angle $\sin\theta'\,d\theta'\,d\phi'$ occurring in (37).

17. The representation of linear operators 

In § 14 we saw how to represent ket and bra vectors by sets of 
numbers. We now have to do the same for linear operators, in order 
to have a complete scheme for representing all our abstract quantities 
by sets of numbers. The same basic vectors that we had in § 14 can 
be used again for this purpose. 

Let us suppose the basic vectors are simultaneous eigenvectors of
a complete set of commuting observables $\xi_1, \xi_2, \ldots, \xi_u$. If $\alpha$ is any
linear operator, we take a general basic bra $\langle\xi'_1\ldots\xi'_u|$ and a general
basic ket $|\xi''_1\ldots\xi''_u\rangle$ and form the numbers

$$\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle. \tag{39}$$

These numbers are sufficient to determine $\alpha$ completely, since in the
first place they determine the ket $\alpha|\xi''_1\ldots\xi''_u\rangle$ (as they provide the
representative of this ket), and the value of this ket for all the basic
kets $|\xi''_1\ldots\xi''_u\rangle$ determines $\alpha$. The numbers (39) are called the
representative of the linear operator $\alpha$ or of the dynamical variable $\alpha$. They
are more complicated than the representative of a ket or bra vector
in that they involve the parameters that label two basic vectors
instead of one.

Let us examine the form of these numbers in simple cases. Take
first the case when there is only one $\xi$, forming a complete commuting
set by itself, and suppose that it has discrete eigenvalues $\xi'$. The
representative of $\alpha$ is then the discrete set of numbers $\langle\xi'|\alpha|\xi''\rangle$. If
one had to write out these numbers explicitly, the natural way of
arranging them would be as a two-dimensional array, thus:

$$\begin{matrix}
\langle\xi^1|\alpha|\xi^1\rangle & \langle\xi^1|\alpha|\xi^2\rangle & \langle\xi^1|\alpha|\xi^3\rangle & \ldots \\
\langle\xi^2|\alpha|\xi^1\rangle & \langle\xi^2|\alpha|\xi^2\rangle & \langle\xi^2|\alpha|\xi^3\rangle & \ldots \\
\langle\xi^3|\alpha|\xi^1\rangle & \langle\xi^3|\alpha|\xi^2\rangle & \langle\xi^3|\alpha|\xi^3\rangle & \ldots \\
\ldots & \ldots & \ldots & \ldots
\end{matrix} \tag{40}$$

where $\xi^1, \xi^2, \xi^3, \ldots$ are all the eigenvalues of $\xi$. Such an array is called
a matrix and the numbers are called the elements of the matrix. We
make the convention that the elements must always be arranged so
that those in the same row refer to the same basic bra vector and
those in the same column refer to the same basic ket vector.

An element $\langle\xi'|\alpha|\xi'\rangle$ referring to two basic vectors with the same
label is called a diagonal element of the matrix, as all such elements
lie on a diagonal. If we put $\alpha$ equal to unity, we have from (16) all
the diagonal elements equal to unity and all the other elements equal
to zero. The matrix is then called the unit matrix.

If $\alpha$ is real, we have

$$\langle\xi'|\alpha|\xi''\rangle = \overline{\langle\xi''|\alpha|\xi'\rangle}. \tag{41}$$

The effect of these conditions on the matrix (40) is to make the
diagonal elements all real and each of the other elements equal the
conjugate complex of its mirror reflection in the diagonal. The matrix
is then called a Hermitian matrix.

If we put $\alpha$ equal to $\xi$, we get for a general element of the matrix

$$\langle\xi'|\xi|\xi''\rangle = \xi'\langle\xi'|\xi''\rangle = \xi'\,\delta_{\xi'\xi''}. \tag{42}$$

Thus all the elements not on the diagonal are zero. The matrix is
then called a diagonal matrix. Its diagonal elements are just equal
to the eigenvalues of $\xi$. More generally, if we put $\alpha$ equal to $f(\xi)$, a
function of $\xi$, we get

$$\langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta_{\xi'\xi''}, \tag{43}$$

and the matrix is again a diagonal matrix.

Let us determine the representative of a product $\alpha\beta$ of two linear
operators $\alpha$ and $\beta$ in terms of the representatives of the factors.
From equation (22) with $\xi'''$ substituted for $\xi'$ we obtain

$$\begin{aligned}
\langle\xi'|\alpha\beta|\xi''\rangle &= \langle\xi'|\alpha \sum_{\xi'''} |\xi'''\rangle\langle\xi'''|\,\beta|\xi''\rangle \\
&= \sum_{\xi'''} \langle\xi'|\alpha|\xi'''\rangle\langle\xi'''|\beta|\xi''\rangle,
\end{aligned} \tag{44}$$

which gives us the required result. Equation (44) shows that the
matrix formed by the elements $\langle\xi'|\alpha\beta|\xi''\rangle$ equals the product of the
matrices formed by the elements $\langle\xi'|\alpha|\xi''\rangle$ and $\langle\xi'|\beta|\xi''\rangle$ respectively,
according to the usual mathematical rule for multiplying matrices.
This rule gives for the element in the rth row and sth column of the
product matrix the sum of the product of each element in the rth
row of the first factor matrix with the corresponding element in the sth
column of the second factor matrix. The multiplication of matrices
is non-commutative, like the multiplication of linear operators.

We can summarize our results for the case when there is only one
$\xi$ and it has discrete eigenvalues as follows:

(i) Any linear operator is represented by a matrix.

(ii) The unit operator is represented by the unit matrix.

(iii) A real linear operator is represented by a Hermitian matrix.

(iv) $\xi$ and functions of $\xi$ are represented by diagonal matrices.

(v) The matrix representing the product of two linear operators is the 
product of the matrices representing the two factors. 
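
The five rules can be checked numerically on a small example. The following Python sketch is an illustration only and is not part of the original text: the three eigenvalues, the operators `alpha` and `beta`, and all function names are assumptions chosen for the purpose. Matrices are stored as nested lists with rows labelled by the basic bra and columns by the basic ket.

```python
# Illustrative sketch only: matrices are nested lists indexed [row][column].

def matmul(a, b):
    """Rule (v), equation (44): sum over the intermediate eigenvalue."""
    n = len(a)
    return [[sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def is_hermitian(a):
    """Condition (41): each element is the conjugate of its mirror image."""
    n = len(a)
    return all(a[r][s] == complex(a[s][r]).conjugate()
               for r in range(n) for s in range(n))

def diagonal(values):
    """Equations (42) and (43): the given values on the diagonal, zero elsewhere."""
    n = len(values)
    return [[values[r] if r == s else 0 for s in range(n)] for r in range(n)]

eigenvalues = [1, 2, 3]                        # assumed eigenvalues of xi
xi = diagonal(eigenvalues)                     # rule (iv)
f_xi = diagonal([e * e for e in eigenvalues])  # f(xi) = xi squared
unit = diagonal([1, 1, 1])                     # rule (ii)
alpha = [[2, 1j, 0], [-1j, 0, 1], [0, 1, 5]]   # a real operator, rule (iii)
beta = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]       # an operator that is not real

print(is_hermitian(alpha), is_hermitian(beta))
print(matmul(unit, alpha) == alpha)                # unit matrix leaves alpha unchanged
print(matmul(xi, f_xi) == matmul(f_xi, xi))        # diagonal matrices commute
print(matmul(alpha, beta) == matmul(beta, alpha))  # general matrices need not
```

The last comparison is the only one that fails for these particular matrices, which illustrates the non-commutative character of the multiplication.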

Let us now consider the case when there is only one $\xi$ and it has
continuous eigenvalues. The representative of $\alpha$ is now $\langle\xi'|\alpha|\xi''\rangle$, a
function of two variables $\xi'$ and $\xi''$ which can vary continuously. It
is convenient to call such a function a 'matrix', using this word in
a generalized sense, in order that we may be able to use the same 
terminology for the discrete and continuous cases. One of these 
generalized matrices cannot, of course, be written out as a two- 
dimensional array like an ordinary matrix, since the number of its 
rows and columns is an infinity equal to the number of points on a 
line, and the number of its elements is an infinity equal to the 
number of points in an area. 

We arrange our definitions concerning these generalized matrices 
so that the rules (i)-(v) which we had above for the discrete case 
hold also for the continuous case. The unit operator is represented 
by $\delta(\xi'-\xi'')$ and the generalized matrix formed by these elements
we define to be the unit matrix. We still have equation (41) as the
condition for $\alpha$ to be real and we define the generalized matrix formed
by the elements $\langle\xi'|\alpha|\xi''\rangle$ to be Hermitian when it satisfies this
condition. $\xi$ is represented by

$$\langle\xi'|\xi|\xi''\rangle = \xi'\,\delta(\xi'-\xi'') \tag{45}$$

and $f(\xi)$ by

$$\langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta(\xi'-\xi''), \tag{46}$$

and the generalized matrices formed by these elements we define to be
diagonal matrices. From (11), we could equally well have $\xi''$ and $f(\xi'')$
as the coefficients of $\delta(\xi'-\xi'')$ on the right-hand sides of (45) and (46)
respectively. Corresponding to equation (44) we now have, from (24),

$$\langle\xi'|\alpha\beta|\xi''\rangle = \int \langle\xi'|\alpha|\xi'''\rangle\, d\xi'''\, \langle\xi'''|\beta|\xi''\rangle, \tag{47}$$

with an integral instead of a sum, and we define the generalized
matrix formed by the elements on the right-hand side here to be the
product of the matrices formed by $\langle\xi'|\alpha|\xi''\rangle$ and $\langle\xi'|\beta|\xi''\rangle$. With
these definitions we secure complete parallelism between the discrete
and continuous cases and we have the rules (i)-(v) holding for both.
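
A rough numerical picture of the generalized matrices may be had by sampling the continuous variable on a grid. This is an assumption made here for illustration and not a construction used in the text: the spacing `h`, the sample points, and the function names are all hypothetical. On the grid the delta function is replaced by $1/h$ at coincident points, and the integral in (47) by a sum multiplied by `h`.

```python
# Sketch under an assumption not made in the text: the continuous variable is
# sampled at n grid points with spacing h, so the integral over the
# intermediate variable becomes a sum multiplied by h.

h = 0.5                                # assumed grid spacing
points = [k * h for k in range(4)]     # assumed sample values of xi'
n = len(points)

def delta(r, s):
    """Grid stand-in for delta(xi' - xi''): 1/h at equal points, else 0."""
    return 1.0 / h if r == s else 0.0

def product(a, b):
    """Grid version of (47): integrate over the intermediate variable."""
    return [[sum(a[r][k] * b[k][s] for k in range(n)) * h for s in range(n)]
            for r in range(n)]

unit = [[delta(r, s) for s in range(n)] for r in range(n)]
xi = [[points[r] * delta(r, s) for s in range(n)] for r in range(n)]           # (45)
xi_sq = [[points[r] ** 2 * delta(r, s) for s in range(n)] for r in range(n)]   # (46)
alpha = [[float(r + 2 * s) for s in range(n)] for r in range(n)]  # arbitrary example

print(product(unit, alpha) == alpha)   # the delta function acts as the unit matrix
print(product(xi, xi) == xi_sq)        # the product of xi with xi represents xi squared
```

The grid values were chosen as multiples of one half so that the floating-point arithmetic is exact; with other spacings the comparisons would need a tolerance.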

The question arises how a general diagonal matrix is to be defined 
in the continuous case, as so far we have only defined the right-hand 
sides of (45) and (46) to be examples of diagonal matrices. One 
might be inclined to define as diagonal any matrix whose $(\xi',\xi'')$
elements all vanish except when $\xi'$ differs infinitely little from $\xi''$,
but this would not be satisfactory, because an important property
of diagonal matrices in the discrete case is that they always commute
with one another and we want this property to hold also in the
continuous case. In order that the matrix formed by the elements
$\langle\xi'|\omega|\xi''\rangle$ in the continuous case may commute with that formed by
the elements on the right-hand side of (45) we must have, using the
multiplication rule (47),

$$\int \langle\xi'|\omega|\xi'''\rangle\, d\xi'''\, \xi'''\,\delta(\xi'''-\xi'') = \int \xi'\,\delta(\xi'-\xi''')\, d\xi'''\, \langle\xi'''|\omega|\xi''\rangle.$$

With the help of formula (4), this reduces to

$$\langle\xi'|\omega|\xi''\rangle\,\xi'' = \xi'\,\langle\xi'|\omega|\xi''\rangle \tag{48}$$

or

$$(\xi'-\xi'')\,\langle\xi'|\omega|\xi''\rangle = 0.$$

This gives, according to the rule by which (13) follows from (12),

$$\langle\xi'|\omega|\xi''\rangle = c'\,\delta(\xi'-\xi''),$$

where $c'$ is a number that may depend on $\xi'$. Thus $\langle\xi'|\omega|\xi''\rangle$ is of the
form of the right-hand side of (46). For this reason we define only
matrices whose elements are of the form of the right-hand side of (46) to
be diagonal matrices. It is easily verified that these matrices all
commute with one another. One can form other matrices whose
$(\xi', \xi'')$ elements all vanish when $\xi'$ differs appreciably from $\xi''$ and
have a different form of singularity when $\xi'$ equals $\xi''$ [we shall later
introduce the derivative $\delta'(x)$ of the $\delta$ function and $\delta'(\xi'-\xi'')$ will
then be an example, see § 22 equation (19)], but these other matrices
are not diagonal according to the definition.

Let us now pass on to the case when there is only one $\xi$ and it has
both discrete and continuous eigenvalues. Using $\xi^r$, $\xi^s$ to denote
discrete eigenvalues and $\xi'$, $\xi''$ to denote continuous eigenvalues, we
now have the representative of $\alpha$ consisting of four kinds of quantities,
$\langle\xi^r|\alpha|\xi^s\rangle$, $\langle\xi^r|\alpha|\xi'\rangle$, $\langle\xi'|\alpha|\xi^r\rangle$, $\langle\xi'|\alpha|\xi''\rangle$. These quantities can all
be put together and considered to form a more general kind of matrix
having some discrete rows and columns and also a continuous range 
of rows and columns. We define unit matrix, Hermitian matrix, 
diagonal matrix, and the product of two matrices also for this more 
general kind of matrix so as to make the rules (i)-(v) still hold. The 
details are a straightforward generalization of what has gone before 
and need not be given explicitly. 

Let us now go back to the general case of several $\xi$'s, $\xi_1, \xi_2, \ldots, \xi_u$.
The representative of $\alpha$, expression (39), may still be looked upon as
forming a matrix, with rows corresponding to different values of
$\xi'_1, \ldots, \xi'_u$ and columns corresponding to different values of $\xi''_1, \ldots, \xi''_u$.
Unless all the $\xi$'s have discrete eigenvalues, this matrix will be of the
generalized kind with continuous ranges of rows and columns. We
again arrange our definitions so that the rules (i)-(v) hold, with rule
(iv) generalized to:

(iv') Each $\xi_m$ $(m = 1, 2, \ldots, u)$ and any function of them is represented
by a diagonal matrix.

A diagonal matrix is now defined as one whose general element
is of the form

$$\langle\xi'_1 \ldots \xi'_u|\omega|\xi''_1 \ldots \xi''_u\rangle = c'\,\delta_{\xi'_1\xi''_1} \ldots \delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1}) \ldots \delta(\xi'_u-\xi''_u) \tag{49}$$

in the case when $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have
continuous eigenvalues, $c'$ being any function of the $\xi'$'s. This definition
is the generalization of what we had with one $\xi$ and makes
diagonal matrices always commute with one another. The other 
definitions are straightforward and need not be given explicitly. 

We now have a linear operator always represented by a matrix. 
The sum of two linear operators is represented by the sum of the 
matrices representing the operators and this, together with rule (v), 
means that the matrices are subject to the same algebraic relations as 
the linear operators. If any algebraic equation holds between certain 
linear operators, the same equation must hold between the matrices 
representing those operators. 

The scheme of matrices can be extended to bring in the representatives
of ket and bra vectors. The matrices representing linear
operators are all square matrices with the same number of rows and
columns, and with, in fact, a one-one correspondence between their
rows and columns. We may look upon the representative of a ket
$|P\rangle$ as a matrix with a single column by setting all the numbers
$\langle\xi'_1 \ldots \xi'_u|P\rangle$ which form this representative one below the other. The
number of rows in this matrix will be the same as the number of
rows or columns in the square matrices representing linear operators.
Such a single-column matrix can be multiplied on the left by a square
matrix $\langle\xi'_1 \ldots \xi'_u|\alpha|\xi''_1 \ldots \xi''_u\rangle$ representing a linear operator, by a rule
similar to that for the multiplication of two square matrices. The
product is another single-column matrix with elements given by

$$\sum_{\xi''_1 \ldots \xi''_v} \int\!\cdots\!\int \langle\xi'_1 \ldots \xi'_u|\alpha|\xi''_1 \ldots \xi''_u\rangle\, d\xi''_{v+1} \ldots d\xi''_u\, \langle\xi''_1 \ldots \xi''_u|P\rangle.$$

From (35) this is just equal to $\langle\xi'_1 \ldots \xi'_u|\alpha|P\rangle$, the representative of
$\alpha|P\rangle$. Similarly we may look upon the representative of a bra $\langle Q|$
as a matrix with a single row by setting all the numbers $\langle Q|\xi'_1 \ldots \xi'_u\rangle$
side by side. Such a single-row matrix may be multiplied on the
right by a square matrix $\langle\xi'_1 \ldots \xi'_u|\alpha|\xi''_1 \ldots \xi''_u\rangle$, the product being another
single-row matrix, which is just the representative of $\langle Q|\alpha$. The
single-row matrix representing $\langle Q|$ may be multiplied on the right
by the single-column matrix representing $|P\rangle$, the product being a
matrix with just a single element, which is equal to $\langle Q|P\rangle$. Finally,
the single-row matrix representing $\langle Q|$ may be multiplied on the left
by the single-column matrix representing $|P\rangle$, the product being a
square matrix, which is just the representative of $|P\rangle\langle Q|$. In this
way all our abstract symbols, linear operators, bra vectors, and ket
vectors, can be represented by matrices, which are subject to the
same algebraic relations as the abstract symbols themselves.
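
The four kinds of product just described can be written out for a two-row example. The Python sketch below is an illustration with assumed numbers and hypothetical function names, restricted to discrete eigenvalues: a ket is a list read as a column, a bra is a list read as a row.

```python
# Illustrative sketch: discrete eigenvalues only, two basic vectors.

def conj(c):
    return complex(c).conjugate()

def op_on_ket(alpha, ket):
    """Square matrix times single-column matrix: representative of alpha|P>."""
    return [sum(row[s] * ket[s] for s in range(len(ket))) for row in alpha]

def bra_on_op(bra, alpha):
    """Single-row matrix times square matrix: representative of <Q|alpha."""
    return [sum(bra[r] * alpha[r][s] for r in range(len(bra)))
            for s in range(len(bra))]

def bra_ket(bra, ket):
    """Row times column: the single number <Q|P>."""
    return sum(b * k for b, k in zip(bra, ket))

def ket_bra(ket, bra):
    """Column times row: the square matrix representing |P><Q|."""
    return [[k * b for b in bra] for k in ket]

alpha = [[0, 1], [1, 0]]              # assumed linear operator
P = [1, 1j]                           # assumed representative of |P>
Q_bra = [conj(c) for c in [2, 0]]     # representative of <Q|, conjugate of the ket's

print(bra_ket(Q_bra, P))
print(bra_ket(Q_bra, op_on_ket(alpha, P)) == bra_ket(bra_on_op(Q_bra, alpha), P))
print(ket_bra(P, Q_bra))
```

The second line printed confirms, for this example, that the number obtained does not depend on whether $\alpha$ is first applied to the ket or to the bra.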

18. Probability amplitudes 

Representations are of great importance in the physical interpretation
of quantum mechanics as they provide a convenient method for
obtaining the probabilities of observables having given values. In
§ 12 we obtained the probability of an observable having any specified
value for a given state and in § 13 we generalized this result
and obtained the probability of a set of commuting observables
simultaneously having specified values for a given state. Let us now
apply this result to a complete set of commuting observables, say the
set of $\xi$'s which we have been dealing with already. According to
formula (51) of § 13, the probability of each $\xi_r$ having the value $\xi'_r$
for the state corresponding to the normalized ket vector $|x\rangle$ is

$$P_{\xi'_1 \ldots \xi'_u} = \langle x|\delta_{\xi_1\xi'_1}\,\delta_{\xi_2\xi'_2} \ldots \delta_{\xi_u\xi'_u}|x\rangle. \tag{50}$$

If the $\xi$'s all have discrete eigenvalues, we can use (35) with $v = u$
and no integrals, and get

$$\begin{aligned}
P_{\xi'_1 \ldots \xi'_u} &= \sum_{\xi''_1 \ldots \xi''_u} \langle x|\delta_{\xi_1\xi'_1}\,\delta_{\xi_2\xi'_2} \ldots \delta_{\xi_u\xi'_u}|\xi''_1 \ldots \xi''_u\rangle\langle\xi''_1 \ldots \xi''_u|x\rangle \\
&= \langle x|\xi'_1 \ldots \xi'_u\rangle\langle\xi'_1 \ldots \xi'_u|x\rangle \\
&= |\langle\xi'_1 \ldots \xi'_u|x\rangle|^2.
\end{aligned} \tag{51}$$

We thus get the simple result that the probability of the $\xi$'s having the
values $\xi'$ is just the square of the modulus of the appropriate coordinate
of the normalized ket vector corresponding to the state concerned.

If the $\xi$'s do not all have discrete eigenvalues, but if, say, $\xi_1, \ldots, \xi_v$
have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have continuous eigenvalues,
then to get something physically significant we must obtain the
probability of each $\xi_r$ $(r = 1, \ldots, v)$ having a specified value $\xi'_r$ and each
$\xi_s$ $(s = v+1, \ldots, u)$ lying in a specified small range $\xi'_s$ to $\xi'_s + d\xi'_s$. For
this purpose we must replace each factor $\delta_{\xi_s\xi'_s}$ in (50) by a factor $\chi_s$,
which is that function of the observable $\xi_s$ which is equal to unity
for $\xi_s$ within the range $\xi'_s$ to $\xi'_s + d\xi'_s$ and zero otherwise. Proceeding
as before with the help of (35), we obtain for this probability

$$P_{\xi'_1 \ldots \xi'_u}\, d\xi'_{v+1} \ldots d\xi'_u = |\langle\xi'_1 \ldots \xi'_u|x\rangle|^2\, d\xi'_{v+1} \ldots d\xi'_u. \tag{52}$$

Thus in every case the probability distribution of values for the $\xi$'s is
given by the square of the modulus of the representative of the normalized
ket vector corresponding to the state concerned.

The numbers which form the representative of a normalized ket 
(or bra) may for this reason be called probability amplitudes. The 
square of the modulus of a probability amplitude is an ordinary 
probability, or a probability per unit range for those variables that 
have continuous ranges of values. 
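
As a small worked instance of (51), one may take a ket with three discrete coordinates and form the squares of the moduli. The Python sketch below uses assumed numbers and hypothetical function names; it also shows that an unnormalized representative gives the same ratios, which is the sense in which it supplies relative probabilities.

```python
import math

def mod_squared(c):
    """Square of the modulus of a (possibly complex) number."""
    c = complex(c)
    return c.real ** 2 + c.imag ** 2

def probabilities(representative):
    """Squares of the moduli, divided by their sum so that they add up to one."""
    weights = [mod_squared(c) for c in representative]
    total = sum(weights)
    return [w / total for w in weights]

x = [3, 4j, 0]    # assumed unnormalized representative <xi'|x>

# Normalize the ket first, then apply (51) directly.
norm = math.sqrt(sum(mod_squared(c) for c in x))
x_normalized = [c / norm for c in x]
direct = [mod_squared(c) for c in x_normalized]

print(direct)
print(probabilities(x))   # same distribution, obtained from the relative amplitudes
```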

We may be interested in a state whose corresponding ket $|x\rangle$ cannot
be normalized. This occurs, for example, if the state is an eigenstate
of some observable belonging to an eigenvalue lying in a range of
eigenvalues. The formula (51) or (52) can then still be used to give
the relative probability of the $\xi$'s having specified values or having
values lying in specified small ranges, i.e. it will give correctly the
ratios of the probabilities for different $\xi'$'s. The numbers $\langle\xi'_1 \ldots \xi'_u|x\rangle$
may then be called relative probability amplitudes.

The representation for which the above results hold is characterized
by the basic vectors being simultaneous eigenvectors of all the $\xi$'s.
It may also be characterized by the requirement that each of the $\xi$'s
shall be represented by a diagonal matrix, this condition being easily
seen to be equivalent to the previous one. The latter characterization
is usually the more convenient one. For brevity, we shall formulate
it as each of the $\xi$'s 'being diagonal in the representation'.

Provided the $\xi$'s form a complete set of commuting observables,
the representation is completely determined by the characterization,
apart from arbitrary phase factors in the basic vectors. Each basic bra
$\langle\xi'_1 \ldots \xi'_u|$ may be multiplied by $e^{i\gamma'}$, where $\gamma'$ is any real function of
the variables $\xi'_1, \ldots, \xi'_u$, without changing any of the conditions which
the representation has to satisfy, i.e. the condition that the $\xi$'s are
diagonal or that the basic vectors are simultaneous eigenvectors of
the $\xi$'s, and the fundamental properties of the basic vectors (34) and
(35). With the basic bras changed in this way, the representative
$\langle\xi'_1 \ldots \xi'_u|P\rangle$ of a ket $|P\rangle$ gets multiplied by $e^{i\gamma'}$, the representative
$\langle Q|\xi'_1 \ldots \xi'_u\rangle$ of a bra $\langle Q|$ gets multiplied by $e^{-i\gamma'}$ and the representative
$\langle\xi'_1 \ldots \xi'_u|\alpha|\xi''_1 \ldots \xi''_u\rangle$ of a linear operator $\alpha$ gets multiplied by $e^{i(\gamma'-\gamma'')}$.
The probabilities or relative probabilities (51), (52) are, of course,
unaltered.
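
The effect of a change of phase factors can be followed numerically. In the Python sketch below the phases, the ket, and the operator are assumed values chosen only for illustration. The representative of the ket is multiplied by $e^{i\gamma'}$ and that of the operator by $e^{i(\gamma'-\gamma'')}$, and the probabilities are compared before and after.

```python
import cmath

gammas = [0.3, -1.2, 2.0]                       # assumed real phases gamma'
ket = [0.6, 0.8j, 0.0]                          # assumed representative <xi'|P>
alpha = [[1, 2j, 0], [-2j, 0, 1], [0, 1, 3]]    # assumed linear operator

phase = [cmath.exp(1j * g) for g in gammas]

# Representatives in the representation with the changed basic bras.
new_ket = [phase[r] * ket[r] for r in range(3)]
new_alpha = [[phase[r] * alpha[r][s] * phase[s].conjugate() for s in range(3)]
             for r in range(3)]

def apply(matrix, column):
    return [sum(matrix[r][s] * column[s] for s in range(3)) for r in range(3)]

old_probs = [abs(c) ** 2 for c in ket]
new_probs = [abs(c) ** 2 for c in new_ket]
print(old_probs)
print(new_probs)

# The representative of alpha|P> picks up the same factor as that of any ket.
old_image = apply(alpha, ket)
new_image = apply(new_alpha, new_ket)
print(all(abs(new_image[r] - phase[r] * old_image[r]) < 1e-12 for r in range(3)))
```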

The probabilities that one calculates in practical problems in 
quantum mechanics are nearly always obtained from the squares 
of the moduli of probability amplitudes or relative probability amplitudes.
Even when one is interested only in the probability of an
incomplete set of commuting observables having specified values, it 
is usually necessary first to make the set a complete one by the 
introduction of some extra commuting observables and to obtain 
the probability of the complete set having specified values (as the 
square of the modulus of a probability amplitude), and then to sum 
or integrate over all possible values of the extra observables. A 
more direct application of formula (51) of § 13 is usually not 
practicable. 
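
The procedure of completing the set and then summing over the extra observables may be sketched as follows. The example is hypothetical: the observable of interest is labelled `xi`, the single extra observable `beta`, and the four amplitudes are assumed numbers forming a normalized ket.

```python
# Assumed probability amplitudes <xi' beta'|x> for a normalized ket, with
# xi taking the values "up", "down" and the extra observable beta the values 0, 1.
amplitudes = {
    ("up", 0): 0.5,
    ("up", 1): 0.5j,
    ("down", 0): -0.5,
    ("down", 1): 0.5,
}

def probability_of_xi(value):
    """Sum |<xi' beta'|x>|^2 over all values beta' of the extra observable."""
    return sum(abs(a) ** 2 for (xi, _beta), a in amplitudes.items() if xi == value)

print(probability_of_xi("up"))
print(probability_of_xi("down"))
```

With continuous eigenvalues for the extra observables the sum would be replaced by an integral over their ranges.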

To introduce a representation in practice 

(i) We look for observables which we would like to have diagonal, 
either because we are interested in their probabilities or for 
reasons of mathematical simplicity; 

(ii) We must see that they all commute, a necessary condition
since diagonal matrices always commute;

(iii) We then see that they form a complete commuting set, and
if not we add some more commuting observables to them to
make them into a complete commuting set;

(iv) We set up an orthogonal representation with this complete 
commuting set diagonal. 

The representation is then completely determined except for the 
arbitrary phase factors. For most purposes the arbitrary phase 
factors are unimportant and trivial, so that we may count the 
representation as being completely determined by the observables 
that are diagonal in it. This fact is already implied in our notation, 
since the only indication in a representative of the representation to 
which it belongs are the letters denoting the observables that are 
diagonal. 

It may be that we are interested in two representations for the 
same dynamical system. Suppose that in one of them the complete 
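set of commuting observables $\xi_1, \ldots, \xi_u$ are diagonal and the basic
bras are $\langle\xi'_1 \ldots \xi'_u|$ and in the other the complete set of commuting
observables $\eta_1, \ldots, \eta_w$ are diagonal and the basic bras are $\langle\eta'_1 \ldots \eta'_w|$.
A ket $|P\rangle$ will now have the two representatives $\langle\xi'_1 \ldots \xi'_u|P\rangle$ and
$\langle\eta'_1 \ldots \eta'_w|P\rangle$. If $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have
continuous eigenvalues and if $\eta_1, \ldots, \eta_x$ have discrete eigenvalues and
$\eta_{x+1}, \ldots, \eta_w$ have continuous eigenvalues, we get from (35)

$$\langle\eta'_1 \ldots \eta'_w|P\rangle = \sum_{\xi'_1 \ldots \xi'_v} \int\!\cdots\!\int \langle\eta'_1 \ldots \eta'_w|\xi'_1 \ldots \xi'_u\rangle\, d\xi'_{v+1} \ldots d\xi'_u\, \langle\xi'_1 \ldots \xi'_u|P\rangle, \tag{53}$$

and interchanging $\xi$'s and $\eta$'s

$$\langle\xi'_1 \ldots \xi'_u|P\rangle = \sum_{\eta'_1 \ldots \eta'_x} \int\!\cdots\!\int \langle\xi'_1 \ldots \xi'_u|\eta'_1 \ldots \eta'_w\rangle\, d\eta'_{x+1} \ldots d\eta'_w\, \langle\eta'_1 \ldots \eta'_w|P\rangle. \tag{54}$$

These are the transformation equations which give one representative
of $|P\rangle$ in terms of the other. They show that either representative
is expressible linearly in terms of the other, with the quantities

$$\langle\eta'_1 \ldots \eta'_w|\xi'_1 \ldots \xi'_u\rangle, \qquad \langle\xi'_1 \ldots \xi'_u|\eta'_1 \ldots \eta'_w\rangle \tag{55}$$

as coefficients. These quantities are called the transformation
functions. Similar equations may be written down to connect the two
representatives of a bra vector or of a linear operator. The transformation
functions (55) are in every case the means which enable
one to pass from one representative to the other. Each of the
transformation functions is the conjugate complex of the other, and
they satisfy the conditions

$$\sum_{\xi'_1 \ldots \xi'_v} \int\!\cdots\!\int \langle\eta'_1 \ldots \eta'_w|\xi'_1 \ldots \xi'_u\rangle\, d\xi'_{v+1} \ldots d\xi'_u\, \langle\xi'_1 \ldots \xi'_u|\eta''_1 \ldots \eta''_w\rangle = \delta_{\eta'_1\eta''_1} \ldots \delta_{\eta'_x\eta''_x}\,\delta(\eta'_{x+1}-\eta''_{x+1}) \ldots \delta(\eta'_w-\eta''_w) \tag{56}$$

and the corresponding conditions with $\xi$'s and $\eta$'s interchanged, as
may be verified from (35) and (34) and the corresponding equations
for the $\eta$'s.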
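
For two representations with discrete eigenvalues only, the transformation equations reduce to finite sums and can be tried out directly. In the Python sketch below the array `T`, standing for $\langle\eta'|\xi'\rangle$, is an assumed example and the function names are hypothetical; the check of (56) is made to within rounding error.

```python
# T[e][x] stands for the transformation function <eta'|xi'>; the numbers are
# an assumed example chosen so that condition (56) is satisfied.
T = [[0.6, 0.8j],
     [0.8j, 0.6]]

def conj(c):
    return complex(c).conjugate()

def xi_to_eta(rep_xi):
    """Equation (53) with discrete eigenvalues only."""
    return [sum(T[e][x] * rep_xi[x] for x in range(2)) for e in range(2)]

def eta_to_xi(rep_eta):
    """Equation (54): uses <xi'|eta'>, the conjugate complex of <eta'|xi'>."""
    return [sum(conj(T[e][x]) * rep_eta[e] for e in range(2)) for x in range(2)]

def condition_56(e1, e2):
    """Left-hand side of (56): sum over xi' of <eta'|xi'><xi'|eta''>."""
    return sum(T[e1][x] * conj(T[e2][x]) for x in range(2))

P_xi = [1.0, 2.0 - 1.0j]        # assumed representative <xi'|P>
P_eta = xi_to_eta(P_xi)         # the other representative <eta'|P>
back = eta_to_xi(P_eta)         # should reproduce P_xi

print(P_eta)
print(back)
print([[condition_56(a, b) for b in range(2)] for a in range(2)])
```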

Transformation functions are examples of probability amplitudes 
or relative probability amplitudes. Let us take the case when all the
$\xi$'s and all the $\eta$'s have discrete eigenvalues. Then the basic ket
$|\eta'_1 \ldots \eta'_w\rangle$ is normalized, so that its representative in the $\xi$-representation,
$\langle\xi'_1 \ldots \xi'_u|\eta'_1 \ldots \eta'_w\rangle$, is a probability amplitude for each set of values
for the $\xi$'s. The state to which these probability amplitudes refer,
namely the state corresponding to $|\eta'_1 \ldots \eta'_w\rangle$, is characterized by the
condition that a simultaneous measurement of $\eta_1, \ldots, \eta_w$ is certain to
lead to the results $\eta'_1, \ldots, \eta'_w$. Thus $|\langle\xi'_1 \ldots \xi'_u|\eta'_1 \ldots \eta'_w\rangle|^2$ is the probability
of the $\xi$'s having the values $\xi'_1, \ldots, \xi'_u$ for the state for which the
$\eta$'s certainly have the values $\eta'_1, \ldots, \eta'_w$. Since

$$|\langle\xi'_1 \ldots \xi'_u|\eta'_1 \ldots \eta'_w\rangle|^2 = |\langle\eta'_1 \ldots \eta'_w|\xi'_1 \ldots \xi'_u\rangle|^2$$

we have the theorem of reciprocity: the probability of the $\xi$'s having
the values $\xi'$ for the state for which the $\eta$'s certainly have the values $\eta'$
is equal to the probability of the $\eta$'s having the values $\eta'$ for the state for
which the $\xi$'s certainly have the values $\xi'$.

If all the $\eta$'s have discrete eigenvalues and some of the $\xi$'s have
continuous eigenvalues, $|\langle\xi'_1 \ldots \xi'_u|\eta'_1 \ldots \eta'_w\rangle|^2$ still gives the probability
distribution of values for the $\xi$'s for the state for which the $\eta$'s certainly
have the values $\eta'$. If some of the $\eta$'s have continuous eigenvalues,
$|\eta'_1 \ldots \eta'_w\rangle$ is not normalized and $|\langle\xi'_1 \ldots \xi'_u|\eta'_1 \ldots \eta'_w\rangle|^2$ then gives
only the relative probability distribution of values for the $\xi$'s for the
state for which the $\eta$'s certainly have the values $\eta'$.
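
The theorem of reciprocity can be seen at work in a two-state example. In the Python sketch below all kets are written in the $\xi$-representation; the particular $\eta$ eigenkets are assumed numbers forming an orthonormal pair, and the function names are hypothetical.

```python
def conj(c):
    return complex(c).conjugate()

def inner(a, b):
    """<A|B> computed from the xi-representatives of the kets |A> and |B>."""
    return sum(conj(u) * v for u, v in zip(a, b))

xi_kets = [[1, 0], [0, 1]]                # basic kets |xi'> in their own representation
eta_kets = [[0.6, 0.8j], [0.8j, 0.6]]     # assumed normalized eigenkets |eta'>

def prob_xi_given_eta(k, j):
    """Probability of xi having its k-th value when eta certainly has its j-th."""
    return abs(inner(xi_kets[k], eta_kets[j])) ** 2

def prob_eta_given_xi(j, k):
    """Probability of eta having its j-th value when xi certainly has its k-th."""
    return abs(inner(eta_kets[j], xi_kets[k])) ** 2

for k in range(2):
    for j in range(2):
        print(k, j, prob_xi_given_eta(k, j), prob_eta_given_xi(j, k))
```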

19. Theorems about functions of observables 

We shall illustrate the mathematical value of representations by 
using them to prove some theorems. 

Theorem 1. A linear operator that commutes with an observable $\xi$
commutes also with any function of $\xi$.

The theorem is obviously true when the function is expressible as
a power series. To prove it generally, let $\omega$ be the linear operator,
so that we have the equation

$$\xi\omega - \omega\xi = 0. \tag{57}$$

Let us introduce a representation in which $\xi$ is diagonal. If $\xi$ by
itself does not form a complete commuting set of observables, we must
make it into a complete commuting set by adding certain observables,
$\beta$ say, to it, and then take the representation in which $\xi$ and the $\beta$'s
are diagonal. (The case when $\xi$ does form a complete commuting set
by itself can be looked upon as a special case of the preceding one
with the number of $\beta$ variables zero.) In this representation equation
(57) becomes

$$\langle\xi'\beta'|\xi\omega - \omega\xi|\xi''\beta''\rangle = 0,$$

which reduces to

$$\xi'\langle\xi'\beta'|\omega|\xi''\beta''\rangle - \langle\xi'\beta'|\omega|\xi''\beta''\rangle\xi'' = 0.$$

In the case when the eigenvalues of $\xi$ are discrete, this equation
shows that all the matrix elements $\langle\xi'\beta'|\omega|\xi''\beta''\rangle$ of $\omega$ vanish except
those for which $\xi' = \xi''$. In the case when the eigenvalues of $\xi$ are
continuous it shows, like equation (48), that $\langle\xi'\beta'|\omega|\xi''\beta''\rangle$ is of the
form

$$\langle\xi'\beta'|\omega|\xi''\beta''\rangle = c\,\delta(\xi'-\xi''),$$

where $c$ is some function of $\xi'$ and the $\beta'$'s and $\beta''$'s. In either case
we may say that the matrix representing $\omega$ 'is diagonal with respect
to $\xi$'. If $f(\xi)$ denotes any function of $\xi$ in accordance with the general
theory of § 11, which requires $f(\xi''')$ to be defined for $\xi'''$ any eigenvalue
of $\xi$, we can deduce in either case

$$f(\xi')\langle\xi'\beta'|\omega|\xi''\beta''\rangle - \langle\xi'\beta'|\omega|\xi''\beta''\rangle f(\xi'') = 0.$$

This gives

$$\langle\xi'\beta'|f(\xi)\,\omega - \omega f(\xi)|\xi''\beta''\rangle = 0,$$

so that

$$f(\xi)\,\omega - \omega f(\xi) = 0$$

and the theorem is proved.
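
The content of the proof can be followed on a three-row example in which one eigenvalue of $\xi$ occurs twice, the two rows being distinguished by an extra label $\beta$. The Python sketch below uses assumed integer matrices and hypothetical function names; the function $f$ is an arbitrary choice.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][s] - ba[r][s] for s in range(n)] for r in range(n)]

def is_zero(m):
    return all(e == 0 for row in m for e in row)

def diagonal(values):
    n = len(values)
    return [[values[r] if r == s else 0 for s in range(n)] for r in range(n)]

xi_values = [1, 1, 2]      # assumed eigenvalues; the value 1 occurs for two beta's
xi = diagonal(xi_values)

def f(v):                  # an arbitrary function defined on the eigenvalues
    return v * v + 1

f_xi = diagonal([f(v) for v in xi_values])

# omega has non-vanishing elements only between rows and columns with equal xi',
# i.e. it is diagonal with respect to xi although not a diagonal matrix.
omega = [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
# other connects the eigenvalues 1 and 2, so it is not diagonal with respect to xi.
other = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]

print(is_zero(commutator(xi, omega)), is_zero(commutator(f_xi, omega)))
print(is_zero(commutator(xi, other)))
```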

As a special case of the theorem, we have the result that any
observable that commutes with an observable $\xi$ also commutes with
any function of $\xi$. This result appears as a physical necessity when
we identify, as in § 13, the condition of commutability of two
observables with the condition of compatibility of the corresponding
observations. Any observation that is compatible with the
measurement of an observable $\xi$ must also be compatible with the
measurement of $f(\xi)$, since any measurement of $\xi$ includes in itself
a measurement of $f(\xi)$.

Theorem 2. A linear operator that commutes with each of a complete
set of commuting observables is a function of those observables.

Let $\omega$ be the linear operator and $\xi_1, \xi_2, \ldots, \xi_u$ the complete set of
commuting observables, and set up a representation with these
observables diagonal. Since $\omega$ commutes with each of the $\xi$'s, the
matrix representing it is diagonal with respect to each of the $\xi$'s,
by the argument we had above. This matrix is therefore a diagonal
matrix and is of the form (49), involving a number $c'$ which is a
function of the $\xi'$'s. It thus represents the function of the $\xi$'s that
$c'$ is of the $\xi'$'s, and hence $\omega$ equals this function of the $\xi$'s.

Theorem 3. If an observable $\xi$ and a linear operator $g$ are such that
any linear operator that commutes with $\xi$ also commutes with $g$, then $g$
is a function of $\xi$.

This is the converse of Theorem 1. To prove it, we use the same
representation with $\xi$ diagonal as we had for Theorem 1. In the first
place, we see that $g$ must commute with $\xi$ itself, and hence the
representative of $g$ must be diagonal with respect to $\xi$, i.e. it must
be of the form

$$\langle\xi'\beta'|g|\xi''\beta''\rangle = a(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad \text{or} \quad a(\xi'\beta'\beta'')\,\delta(\xi'-\xi''),$$

according to whether $\xi$ has discrete or continuous eigenvalues. Now
let $\omega$ be any linear operator that commutes with $\xi$, so that its
representative is of the form

$$\langle\xi'\beta'|\omega|\xi''\beta''\rangle = b(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad \text{or} \quad b(\xi'\beta'\beta'')\,\delta(\xi'-\xi'').$$

By hypothesis $\omega$ must also commute with $g$, so that

$$\langle\xi'\beta'|g\omega - \omega g|\xi''\beta''\rangle = 0. \tag{58}$$

If we suppose for definiteness that the $\beta$'s have discrete eigenvalues,
(58) leads, with the help of the law of matrix multiplication, to

$$\sum_{\beta'''} \{a(\xi'\beta'\beta''')\,b(\xi'\beta'''\beta'') - b(\xi'\beta'\beta''')\,a(\xi'\beta'''\beta'')\} = 0, \tag{59}$$

the left-hand side of (58) being equal to the left-hand side of (59)
multiplied by $\delta_{\xi'\xi''}$ or $\delta(\xi'-\xi'')$. Equation (59) must hold for all
functions $b(\xi'\beta'\beta'')$. We can deduce that

$$a(\xi'\beta'\beta'') = 0 \quad \text{for } \beta' \neq \beta'',$$
$$a(\xi'\beta'\beta') = a(\xi'\beta''\beta'').$$

The first of these results shows that the matrix representing $g$ is
diagonal and the second shows that $a(\xi'\beta'\beta')$ is a function of $\xi'$ only.
We can now infer that $g$ is that function of $\xi$ which $a(\xi'\beta'\beta')$ is of
$\xi'$, so the theorem is proved. The proof is analogous if some of the $\beta$'s
have continuous eigenvalues.

Theorems 1 and 3 are still valid if we replace the observable $\xi$ by
any set of commuting observables $\xi_1, \xi_2, \ldots, \xi_r$, only formal changes
being needed in the proofs.


20. Developments in notation 

The theory of representations that we have developed provides a 
general system for labelling kets and bras. In a representation in which 
the complete set of commuting observables $\xi_1, \ldots, \xi_u$ are diagonal any
ket $|P\rangle$ will have a representative $\langle\xi'_1 \ldots \xi'_u|P\rangle$, or $\langle\xi'|P\rangle$ for brevity.
This representative is a definite function of the variables $\xi'$, say $\psi(\xi')$.
The function $\psi$ then determines the ket $|P\rangle$ completely, so it may be
used to label this ket, to replace the arbitrary label $P$. In symbols,

$$\begin{aligned}
\text{if} \quad & \langle\xi'|P\rangle = \psi(\xi') \\
\text{we put} \quad & |P\rangle = |\psi(\xi)\rangle.
\end{aligned} \tag{60}$$

We must put $|P\rangle$ equal to $|\psi(\xi)\rangle$ and not $|\psi(\xi')\rangle$, since it does not
depend on a particular set of eigenvalues for the $\xi$'s, but only on the
form of the function $\psi$.

With $f(\xi)$ any function of the observables $\xi_1, \ldots, \xi_u$, $f(\xi)|P\rangle$ will
have as its representative

$$\langle\xi'|f(\xi)|P\rangle = f(\xi')\,\psi(\xi').$$

Thus according to (60) we put

$$f(\xi)|P\rangle = |f(\xi)\psi(\xi)\rangle.$$

With the help of the second of equations (60) we now get

$$f(\xi)|\psi(\xi)\rangle = |f(\xi)\psi(\xi)\rangle. \tag{61}$$

This is a general result holding for any functions $f$ and $\psi$ of the $\xi$'s,
and it shows that the vertical line | is not necessary with the new
notation for a ket: either side of (61) may be written simply as
$f(\xi)\psi(\xi)\rangle$. Thus the rule for the new notation becomes:

$$\begin{aligned}
\text{if} \quad & \langle\xi'|P\rangle = \psi(\xi') \\
\text{we put} \quad & |P\rangle = \psi(\xi)\rangle.
\end{aligned} \tag{62}$$

We may further shorten $\psi(\xi)\rangle$ to $\psi\rangle$, leaving the variables $\xi$ understood,
if no ambiguity arises thereby.

The ket $\psi(\xi)\rangle$ may be considered as the product of the linear
operator $\psi(\xi)$ with a ket which is denoted simply by $\rangle$ without a
label. We call the ket $\rangle$ the standard ket. Any ket whatever can be
expressed as a function of the $\xi$'s multiplied into the standard ket.
For example, taking $|P\rangle$ in (62) to be the basic ket $|\xi''\rangle$, we find

$$|\xi''\rangle = \delta_{\xi_1\xi''_1} \ldots \delta_{\xi_v\xi''_v}\,\delta(\xi_{v+1}-\xi''_{v+1}) \ldots \delta(\xi_u-\xi''_u)\rangle \tag{63}$$

in the case when $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have
continuous eigenvalues. The standard ket is characterized by the
condition that its representative $\langle\xi'|\rangle$ is unity over the whole domain
of the variable $\xi'$, as may be seen by putting $\psi = 1$ in (62).

A further contraction may be made in the notation, namely to
leave the symbol $\rangle$ for the standard ket understood. A ket is then
written simply as $\psi(\xi)$, a function of the observables $\xi$. A function
of the $\xi$'s used in this way to denote a ket is called a wave function.†
The system of notation provided by wave functions is the one usually
used by most authors for calculations in quantum mechanics. In
using it one should remember that each wave function is understood
to have the standard ket multiplied into it on the right, which
prevents one from multiplying the wave function by any operator
on the right. Wave functions can be multiplied by operators only on
the left. This distinguishes them from ordinary functions of the $\xi$'s,
which are operators and can be multiplied by operators on either the
left or the right. A wave function is just the representative of a ket
expressed as a function of the observables $\xi$, instead of eigenvalues $\xi'$
for those observables. The square of its modulus gives the probability
(or the relative probability, if it is not normalized) of the $\xi$'s
having specified values, or lying in specified small ranges, for the
corresponding state.

† The reason for this name is that in the early days of quantum mechanics all the
examples of these functions were of the form of waves. The name is not a descriptive
one from the point of view of the modern general theory.

The new notation for bras may be developed in the same way as
for kets. A bra $\langle Q|$ whose representative $\langle Q|\xi'\rangle$ is $\phi(\xi')$ we write
$\langle\phi(\xi)|$. With this notation the conjugate imaginary to $|\psi(\xi)\rangle$ is
$\langle\bar\psi(\xi)|$. Thus the rule that we have used hitherto, that a ket and
its conjugate imaginary bra are both specified by the same label,
must be extended to read: if the labels of a ket involve complex
numbers or complex functions, the labels of the conjugate imaginary
bra involve the conjugate complex numbers or functions. As in the
case of kets we can show that $\langle\phi(\xi)|f(\xi)$ and $\langle\phi(\xi)f(\xi)|$ are the same,
so that the vertical line can be omitted. We can consider $\langle\phi(\xi)$ as
the product of the linear operator $\phi(\xi)$ into the standard bra $\langle$, which
is the conjugate imaginary of the standard ket $\rangle$. We may leave
the standard bra understood, so that a general bra is written as $\phi(\xi)$,
the conjugate complex of a wave function. The conjugate complex
of a wave function can be multiplied by any linear operator on the
right, but cannot be multiplied by a linear operator on the left. We
can construct triple products of the form $\langle f(\xi)\rangle$. Such a triple product
is a number, equal to $f(\xi)$ summed or integrated over the whole
domain of eigenvalues for the $\xi$'s,

$$\langle f(\xi)\rangle = \sum_{\xi'_1 \ldots \xi'_v} \int\!\cdots\!\int f(\xi')\, d\xi'_{v+1} \ldots d\xi'_u \tag{64}$$

in the case when $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have
continuous eigenvalues.
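
The standard ket, the rule (61), and the triple product (64) may be tried out for a single $\xi$ with a few discrete eigenvalues. The Python sketch below is an illustration under that assumption; the eigenvalues, the functions `psi` and `f`, and the helper names are all hypothetical.

```python
eigenvalues = [-1, 0, 1, 2]                    # assumed discrete eigenvalues xi'
standard_ket = [1 for _ in eigenvalues]        # its representative is unity everywhere
standard_bra = [1 for _ in eigenvalues]        # conjugate imaginary of the standard ket

def times_function(f, ket):
    """Apply the operator f(xi): multiply each coordinate by f(xi')."""
    return [f(v) * c for v, c in zip(eigenvalues, ket)]

def triple_product(f):
    """Equation (64) with discrete eigenvalues only: the sum of f(xi')."""
    ket = times_function(f, standard_ket)
    return sum(b * k for b, k in zip(standard_bra, ket))

def psi(v):                                    # assumed wave function
    return v + 1j

def f(v):                                      # assumed function of xi
    return v * v

wave = times_function(psi, standard_ket)       # the ket written psi(xi)>

# Rule (61): f(xi) applied to psi(xi)> equals the ket labelled by the product f*psi.
left = times_function(f, wave)
right = times_function(lambda v: f(v) * psi(v), standard_ket)
print(left == right)

print(triple_product(f))                                      # sum of xi' squared
print(triple_product(lambda v: psi(v).conjugate() * psi(v)))  # squared length of psi>
```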

The standard ket and bra are defined with respect to a representation.
If we carried through the above work with a different representation
in which the complete set of commuting observables $\eta$ are
diagonal, or if we merely changed the phase factors in the representation
with the $\xi$'s diagonal, we should get a different standard ket and
bra. In a piece of work in which more than one standard ket or bra
appears one must, of course, distinguish them by giving them labels.

A further development of the notation which is of great importance
for dealing with complicated dynamical systems will now be discussed.
Suppose we have a dynamical system describable in terms of dynamical
variables which can all be divided into two sets, set A and set B
say, such that any member of set A commutes with any member of
set B. A general dynamical variable must be expressible as a function
of the A-variables and B-variables together. We may consider
another dynamical system in which the dynamical variables are the
A-variables only; let us call it the A-system. Similarly we may
consider a third dynamical system in which the dynamical variables
are the B-variables only, the B-system. The original system can
then be looked upon as a combination of the A-system and the
B-system in accordance with the mathematical scheme given below.

Let us take any ket $|a\rangle$ for the A-system and any ket $|b\rangle$ for the
B-system. We assume that they have a product $|a\rangle|b\rangle$ for which
the commutative and distributive axioms of multiplication hold, i.e.

$$\begin{aligned}
|a\rangle|b\rangle &= |b\rangle|a\rangle, \\
\{c_1|a_1\rangle + c_2|a_2\rangle\}|b\rangle &= c_1|a_1\rangle|b\rangle + c_2|a_2\rangle|b\rangle, \\
|a\rangle\{c_1|b_1\rangle + c_2|b_2\rangle\} &= c_1|a\rangle|b_1\rangle + c_2|a\rangle|b_2\rangle,
\end{aligned}$$

the $c$'s being numbers. We can give a meaning to any A-variable
operating on the product $|a\rangle|b\rangle$ by assuming that it operates only
on the $|a\rangle$ factor and commutes with the $|b\rangle$ factor, and similarly
we can give a meaning to any B-variable operating on this product
by assuming that it operates only on the $|b\rangle$ factor and commutes
with the $|a\rangle$ factor. (This makes every A-variable commute with
every B-variable.) Thus any dynamical variable of the original
system can operate on the product $|a\rangle|b\rangle$, so this product can be
looked upon as a ket for the original system, and may then be
written $|ab\rangle$, the two labels $a$ and $b$ being sufficient to specify it.
In this way we get the fundamental equations

$$|a\rangle|b\rangle = |b\rangle|a\rangle = |ab\rangle. \tag{65}$$

The multiplication here is of quite a different kind from any that 
occurs earlier in the theory. The ket vectors $|a\rangle$ and $|b\rangle$ are in two
different vector spaces and their product is in a third vector space, 
which may be called the product of the two previous vector spaces. 
The number of dimensions of the product space is equal to the 
product of the number of dimensions of each of the factor spaces. 
A general ket vector of the product space is not of the form (65), but 
is a sum or integral of kets of this form. 

Let us take a representation for the A-system in which a complete
set of commuting observables $\xi_A$ of the A-system are diagonal. We
shall then have the basic bras $\langle\xi'_A|$ for the A-system. Similarly, taking
a representation for the B-system with the observables $\xi_B$ diagonal,
we shall have the basic bras $\langle\xi'_B|$ for the B-system. The products

$$\langle\xi'_A|\langle\xi'_B| = \langle\xi'_A\,\xi'_B| \tag{66}$$

will then provide the basic bras for a representation for the original
system, in which representation the $\xi_A$'s and the $\xi_B$'s will be diagonal.
The $\xi_A$'s and $\xi_B$'s will together form a complete set of commuting
observables for the original system. From (65) and (66) we get

$$\langle\xi'_A|a\rangle\langle\xi'_B|b\rangle = \langle\xi'_A\,\xi'_B|ab\rangle, \tag{67}$$

showing that the representative of $|ab\rangle$ equals the product of the
representatives of $|a\rangle$ and of $|b\rangle$ in their respective representations.

We can introduce the standard ket, $\rangle_A$ say, for the A-system,
with respect to the representation with the $\xi_A$'s diagonal, and also
the standard ket $\rangle_B$ for the B-system, with respect to the
representation with the $\xi_B$'s diagonal. Their product $\rangle_A\rangle_B$ is then the
standard ket for the original system, with respect to the
representation with the $\xi_A$'s and $\xi_B$'s diagonal. Any ket for the original system
may be expressed as

$$\psi(\xi_A\xi_B)\rangle_A\rangle_B. \qquad (68)$$


It may be that in a certain calculation we wish to use a particular
representation for the B-system, say the above representation with
the $\xi_B$'s diagonal, but do not wish to introduce any particular
representation for the A-system. It would then be convenient to
use the standard ket $\rangle_B$ for the B-system and no standard ket for
the A-system. Under these circumstances we could write any ket
for the original system as

$$|\xi_B\rangle\rangle_B, \qquad (69)$$

in which $|\xi_B\rangle$ is a ket for the A-system and is also a function of the
$\xi_B$'s, i.e. it is a ket for the A-system for each set of values for the
$\xi_B$'s; in fact (69) equals (68) if we take

$$|\xi_B\rangle = \psi(\xi_A\xi_B)\rangle_A.$$

We may leave the standard ket in (69) understood, and then we
have the general ket for the original system appearing as $|\xi_B\rangle$, a ket
for the A-system and a wave function in the variables $\xi_B$ of the
B-system. Examples of this notation will be used in §§ 66 and 79.

The above work can be immediately extended to a dynamical
system describable in terms of dynamical variables which can be
divided into three or more sets A, B, C,... such that any member of
one set commutes with any member of another. Equation (65) gets
generalized to

$$|a\rangle|b\rangle|c\rangle\cdots = |abc\cdots\rangle,$$

the factors on the left being kets for the component systems and
the ket on the right being a ket for the original system. Equations
(66), (67), and (68) get generalized to many factors in a similar way.

21. Poisson brackets

Our work so far has consisted in setting up a general mathematical
scheme connecting states and observables in quantum mechanics. 
One of the dominant features of this scheme is that observables, and 
dynamical variables in general, appear in it as quantities which do 
not obey the commutative law of multiplication. It now becomes 
necessary for us to obtain equations to replace the commutative law 
of multiplication, equations that will tell us the value of $\xi\eta - \eta\xi$ when
$\xi$ and $\eta$ are any two observables or dynamical variables. Only when
such equations are known shall we have a complete scheme of
mechanics with which to replace classical mechanics. These new
equations are called quantum conditions or commutation relations.

The problem of finding quantum conditions is not of such a general 
character as those we have been concerned with up to the present. It 
is instead a special problem which presents itself with each particular 
dynamical system one is called upon to study. There is, however, 
a fairly general method of obtaining quantum conditions, applicable 
to a very large class of dynamical systems. This is the method of 
classical analogy and will form the main theme of the present chapter. 
Those dynamical systems to which this method is not applicable 
must be treated individually and special considerations used in each 
case. 

The value of classical analogy in the development of quantum 
mechanics depends on the fact that classical mechanics provides a 
valid description of dynamical systems under certain conditions, 
when the particles and bodies composing the systems are sufficiently 
massive for the disturbance accompanying an observation to be 
negligible. Classical mechanics must therefore be a limiting case of 
quantum mechanics. We should thus expect to find that important 
concepts in classical mechanics correspond to important concepts in 
quantum mechanics, and, from an understanding of the general 
nature of the analogy between classical and quantum mechanics, we 
may hope to get laws and theorems in quantum mechanics appearing 
as simple generalizations of well-known results in classical mechanics; 
in particular we may hope to get the quantum conditions appearing
as a simple generalization of the classical law that all dynamical
variables commute. 

Let us take a dynamical system composed of a number of particles 
in interaction. As independent dynamical variables for dealing with 
the system we may use the Cartesian coordinates of all the particles 
and the corresponding Cartesian components of velocity of the
particles. It is, however, more convenient to work with the momentum
components instead of the velocity components. Let us call the
coordinates $q_r$, $r$ going from 1 to three times the number of particles,
and the corresponding momentum components $p_r$. The $q$'s and $p$'s
are called canonical coordinates and momenta. 

The method of Lagrange's equations of motion involves introducing
coordinates $q_r$ and momenta $p_r$ in a more general way, applicable
also for a system not composed of particles (e.g. a system containing
rigid bodies). These more general $q$'s and $p$'s are also called canonical
coordinates and momenta. Any dynamical variable is expressible in 
terms of a set of canonical coordinates and momenta. 

An important concept in general dynamical theory is the Poisson
Bracket. Any two dynamical variables $u$ and $v$ have a P.B. (Poisson
Bracket) which we shall denote by $[u, v]$, defined by

$$[u, v] = \sum_r \left\{ \frac{\partial u}{\partial q_r}\frac{\partial v}{\partial p_r} - \frac{\partial u}{\partial p_r}\frac{\partial v}{\partial q_r} \right\}, \qquad (1)$$

$u$ and $v$ being regarded as functions of a set of canonical coordinates
and momenta $q_r$ and $p_r$ for the purpose of the differentiations. The
right-hand side of (1) is independent of which set of canonical
coordinates and momenta are used, this being a consequence of the
general definition of canonical coordinates and momenta, so the
P.B. $[u, v]$ is well defined.

The main properties of P.B.’s, which follow at once from their 
definition (1), are 

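$$[u, v] = -[v, u], \qquad (2)$$

$$[u, c] = 0, \qquad (3)$$

where $c$ is a number (which may be considered as a special case of a
dynamical variable),

$$[u_1 + u_2, v] = [u_1, v] + [u_2, v], \qquad [u, v_1 + v_2] = [u, v_1] + [u, v_2], \qquad (4)$$

$$[u_1 u_2, v] = \sum_r \left\{ \left( \frac{\partial u_1}{\partial q_r}u_2 + u_1\frac{\partial u_2}{\partial q_r} \right)\frac{\partial v}{\partial p_r} - \left( \frac{\partial u_1}{\partial p_r}u_2 + u_1\frac{\partial u_2}{\partial p_r} \right)\frac{\partial v}{\partial q_r} \right\} = [u_1, v]u_2 + u_1[u_2, v],$$

$$[u, v_1 v_2] = [u, v_1]v_2 + v_1[u, v_2]. \qquad (5)$$

Also the identity

$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0 \qquad (6)$$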

is easily verified. Equations (4) express that the P.B. [u, v] involves 
u and v linearly, while equations (5) correspond to the ordinary rules 
for differentiating a product. 
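
The properties (2), (5), and (6) can be checked mechanically for
polynomial variables. The following is a minimal sketch, assuming one
degree of freedom and polynomials in $q$ and $p$ with integer coefficients;
the dictionary representation and the helper names are choices made for
this illustration only.

```python
from collections import defaultdict

# A polynomial is a dict {(i, j): coeff}, meaning the sum of coeff * q**i * p**j.

def clean(poly):
    """Drop zero terms so that equal polynomials compare equal."""
    return {k: c for k, c in poly.items() if c != 0}

def add(u, v, sign=1):
    out = defaultdict(int)
    for k, c in u.items():
        out[k] += c
    for k, c in v.items():
        out[k] += sign * c
    return clean(out)

def mul(u, v):
    out = defaultdict(int)
    for (i, j), c in u.items():
        for (k, l), d in v.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def d_dq(u):
    return clean({(i - 1, j): i * c for (i, j), c in u.items() if i > 0})

def d_dp(u):
    return clean({(i, j - 1): j * c for (i, j), c in u.items() if j > 0})

def pb(u, v):
    """Classical P.B. (1) for a single pair q, p."""
    return add(mul(d_dq(u), d_dp(v)), mul(d_dp(u), d_dq(v)), sign=-1)

q = {(1, 0): 1}
p = {(0, 1): 1}
u = add(mul(q, q), mul(q, p))   # q^2 + q p
v = add(mul(p, p), q)           # p^2 + q
w = mul(mul(q, p), p)           # q p^2

antisymmetric = pb(u, v) == add({}, pb(v, u), sign=-1)                       # (2)
product_rule = pb(mul(u, w), v) == add(mul(pb(u, v), w), mul(u, pb(w, v)))   # (5)
jacobi = add(add(pb(u, pb(v, w)), pb(v, pb(w, u))), pb(w, pb(u, v))) == {}   # (6)
elementary = pb(q, p) == {(0, 0): 1}
```

All four flags come out true for these sample polynomials. Since the
classical variables commute, the order of factors in the product rule is
immaterial here; it becomes essential only in the quantum case.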

Let us try to introduce a quantum P.B. which shall be the analogue
of the classical one. We assume the quantum P.B. to satisfy all the
conditions (2) to (6), it being now necessary that the order of the
factors $u_1$ and $u_2$ in the first of equations (5) should be preserved
throughout the equation, as in the way we have here written it, and
similarly for the $v_1$ and $v_2$ in the second of equations (5). These
conditions are already sufficient to determine the form of the quantum
P.B. uniquely, as may be seen from the following argument. We can
evaluate the P.B. $[u_1 u_2, v_1 v_2]$ in two different ways, since we can use
either of the two formulas (5) first, thus,

$$[u_1 u_2, v_1 v_2] = [u_1, v_1 v_2]u_2 + u_1[u_2, v_1 v_2]$$

$$= \{[u_1, v_1]v_2 + v_1[u_1, v_2]\}u_2 + u_1\{[u_2, v_1]v_2 + v_1[u_2, v_2]\}$$

$$= [u_1, v_1]v_2 u_2 + v_1[u_1, v_2]u_2 + u_1[u_2, v_1]v_2 + u_1 v_1[u_2, v_2]$$

and

$$[u_1 u_2, v_1 v_2] = [u_1 u_2, v_1]v_2 + v_1[u_1 u_2, v_2]$$

$$= [u_1, v_1]u_2 v_2 + u_1[u_2, v_1]v_2 + v_1[u_1, v_2]u_2 + v_1 u_1[u_2, v_2].$$

Equating these two results, we obtain

$$[u_1, v_1](u_2 v_2 - v_2 u_2) = (u_1 v_1 - v_1 u_1)[u_2, v_2].$$

Since this condition holds with $u_1$ and $v_1$ quite independent of $u_2$ and
$v_2$, we must have

$$u_1 v_1 - v_1 u_1 = i\hbar[u_1, v_1],$$

$$u_2 v_2 - v_2 u_2 = i\hbar[u_2, v_2],$$

where $\hbar$ must not depend on $u_1$ and $v_1$, nor on $u_2$ and $v_2$, and also
must commute with $(u_1 v_1 - v_1 u_1)$. It follows that $\hbar$ must be simply
a number. We want the P.B. of two real variables to be real, as in
the classical theory, which requires, from the work at the top of p. 28,
that $\hbar$ shall be a real number when introduced, as here, with the
coefficient $i$. We are thus led to the following definition for the
quantum P.B. $[u, v]$ of any two variables $u$ and $v$,

$$uv - vu = i\hbar[u, v], \qquad (7)$$

in which $\hbar$ is a new universal constant. It has the dimensions of
action. In order that the theory may agree with experiment, we
must take $\hbar$ equal to $h/2\pi$, where $h$ is the universal constant that
was introduced by Planck, known as Planck's constant. It is easily
verified that the quantum P.B. satisfies all the conditions (2), (3), (4),
(5), and (6).
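
The verification just mentioned can be sketched with small matrices
standing in for non-commuting variables. This is only an illustration
under stated assumptions: units with $\hbar = 1$, hypothetical real symmetric
2 x 2 matrices as the real variables, and helper names chosen here.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(x, y, sign=1):
    """Entrywise x + sign * y."""
    return [[a + sign * b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def qpb(u, v):
    """Quantum P.B. obtained from (7): [u, v] = (uv - vu) / (i * hbar)."""
    comm = lin(matmul(u, v), matmul(v, u), sign=-1)
    return [[c * (-1j / HBAR) for c in row] for row in comm]

def max_abs(x):
    return max(abs(c) for row in x for c in row)

# Hypothetical real (here: real symmetric) variables.
u1 = [[1, 2], [2, 0]]
u2 = [[0, 1], [1, 3]]
v = [[2, -1], [-1, 1]]

# First of equations (5), with the order of u1 and u2 preserved.
product_rule_gap = max_abs(lin(
    qpb(matmul(u1, u2), v),
    lin(matmul(qpb(u1, v), u2), matmul(u1, qpb(u2, v))),
    sign=-1))

# Identity (6).
jacobi_gap = max_abs(lin(lin(qpb(u1, qpb(u2, v)), qpb(u2, qpb(v, u1))),
                         qpb(v, qpb(u1, u2))))

# The P.B. of two real variables is again real (equal to its own conjugate transpose).
bracket = qpb(u1, u2)
bracket_is_real = all(abs(bracket[i][j] - bracket[j][i].conjugate()) < 1e-12
                      for i in range(2) for j in range(2))
```

Both gaps vanish, and `bracket_is_real` holds because the factor $i$ turns
the anti-Hermitian commutator of two Hermitian matrices into a Hermitian
one, which is the reason $\hbar$ has to be a real number.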

The problem of finding quantum conditions now reduces to the 
problem of determining P.B.’s in quantum mechanics. The strong 
analogy between the quantum P.B. defined by (7) and the classical 
P.B. defined by (1) leads us to make the assumption that the quantum
P.B.’s, or at any rate the simpler ones of them, have the same values 
as the corresponding classical P.B.’s. The simplest P.B.’s are those 
involving the canonical coordinates and momenta themselves and 
have the following values in the classical theory: 

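$$[q_r, q_s] = 0, \qquad [p_r, p_s] = 0, \qquad [q_r, p_s] = \delta_{rs}. \qquad (8)$$

We therefore assume that the corresponding quantum P.B.'s also
have the values given by (8). By eliminating the quantum P.B.'s
with the help of (7), we obtain the equations

$$q_r q_s - q_s q_r = 0, \qquad p_r p_s - p_s p_r = 0, \qquad q_r p_s - p_s q_r = i\hbar\,\delta_{rs}, \qquad (9)$$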

which are the fundamental quantum conditions. They show us where 
the lack of commutability among the canonical coordinates and 
momenta lies. They also provide us with a basis for calculating
commutation relations between other dynamical variables. For instance,
if $\xi$ and $\eta$ are any two functions of the $q$'s and $p$'s expressible as
power series, we may express $\xi\eta - \eta\xi$, or $[\xi, \eta]$, by repeated
applications of the laws (2), (3), (4), and (5), in terms of the elementary
P.B.'s given in (8) and so evaluate it. The result is often, in simple
cases, the same as the classical result, or departs from the classical
result only through requiring a special order for factors in a product,
this order being, of course, unimportant in the classical theory. Even
when $\xi$ and $\eta$ are more general functions of the $q$'s and $p$'s not
expressible as power series, equations (9) are still sufficient to fix the
value of $\xi\eta - \eta\xi$, as will become clear from the following work.
Equations (9) thus give the solution of the problem of finding the 
quantum conditions, for all those dynamical systems which have a 
classical analogue and which are describable in terms of canonical 
coordinates and momenta. This does not include all possible systems 
in quantum mechanics. 

Equations (7) and (9) provide the foundation for the analogy 
between quantum mechanics and classical mechanics. They show 
that classical mechanics may be regarded as the limiting case of quantum
mechanics when $\hbar$ tends to zero. A P.B. in quantum mechanics is a
purely algebraic notion and is thus a rather more fundamental
concept than a classical P.B., which can be defined only with reference to
a set of canonical coordinates and momenta. For this reason canonical
coordinates and momenta are of less importance in quantum mechanics
than in classical mechanics; in fact, we may have a system in
quantum mechanics for which canonical coordinates and momenta do
not exist and we can still give a meaning to P.B.’s. Such a system 
would be one without a classical analogue and we should not be able 
to obtain its quantum conditions by the method here described. 

From equations (9) we see that two variables with different suffixes 
$r$ and $s$ always commute. It follows that any function of $q_r$ and $p_r$
will commute with any function of $q_s$ and $p_s$ when $s$ differs from $r$.
Different values of r correspond to different degrees of freedom of the 
dynamical system, so we get the result that dynamical variables 
referring to different degrees of freedom commute. This law, as we have 
derived it from (9), is proved only for dynamical systems with 
classical analogues, but we assume it to hold generally. In this way 
we can make a start on the problem of finding quantum conditions 
for dynamical systems for which canonical coordinates and momenta 
do not exist, provided we can give a meaning to different degrees of 
freedom, as we may be able to do with the help of physical insight. 

We can now see the physical meaning of the division, which was 
discussed in the preceding section, of the dynamical variables into 
sets, any member of one set commuting with any member of another. 
Each set corresponds to certain degrees of freedom, or possibly just 
one degree of freedom. The division may correspond to the physical 
process of resolving the dynamical system into its constituent parts, 
each constituent being capable of existing by itself as a physical 
system, and the various constituents having to be brought into
interaction with one another to produce the original system.
Alternatively the division may be merely a mathematical procedure of
resolving the dynamical system into degrees of freedom which cannot
be separated physically, e.g. the system consisting of a particle with
internal structure may be divided into the degrees of freedom
describing the motion of the centre of the particle and those describing the
internal structure. 

22. Schrödinger's representation

Let us consider a dynamical system with n degrees of freedom 
having a classical analogue, and thus describable in terms of canonical 
coordinates and momenta $q_r$, $p_r$ ($r = 1, 2, \ldots, n$). We assume that the
coordinates $q_r$ are all observables and have continuous ranges of
eigenvalues, these assumptions being reasonable from the physical
significance of the $q$'s. Let us set up a representation with the $q$'s diagonal.
The question arises whether the $q$'s form a complete commuting set
for this dynamical system. It seems pretty obvious from inspection
that they do. We shall here assume that they do, and the assumption
will be justified later (see top of p. 92). With the $q$'s forming a
complete commuting set, the representation is fixed except for the 
arbitrary phase factors in it. 

Let us consider first the case of $n = 1$, so that there is only one $q$
and $p$, satisfying

$$qp - pq = i\hbar. \qquad (10)$$

Any ket may be written in the standard ket notation $\psi(q)\rangle$. From it
we can form another ket $d\psi/dq\rangle$, whose representative is the
derivative of the original one. This new ket is a linear function of the
original one and is thus the result of some linear operator applied to
the original one. Calling this linear operator $d/dq$, we have

$$\frac{d}{dq}\psi\rangle = \frac{d\psi}{dq}\rangle. \qquad (11)$$

Equation (11) holding for all functions $\psi$ defines the linear operator
$d/dq$. We have

$$\frac{d}{dq}\rangle = 0. \qquad (12)$$

Let us treat the linear operator $d/dq$ according to the general theory
of linear operators of § 7. We should then be able to apply it to a bra
$\langle\phi(q)$, the product $\langle\phi\,d/dq$ being defined, according to (3) of § 7, by

$$\left\{\langle\phi\frac{d}{dq}\right\}\psi\rangle = \langle\phi\left\{\frac{d}{dq}\psi\rangle\right\} \qquad (13)$$

for all functions $\psi(q)$. Taking representatives, we get

$$\int \langle\phi\frac{d}{dq}|q'\rangle\,dq'\,\psi(q') = \int \phi(q')\,dq'\,\frac{d\psi(q')}{dq'}. \qquad (14)$$

We can transform the right-hand side by partial integration and get

$$\int \langle\phi\frac{d}{dq}|q'\rangle\,dq'\,\psi(q') = -\int \frac{d\phi(q')}{dq'}\,dq'\,\psi(q'), \qquad (15)$$

provided the contributions from the limits of integration vanish.
This gives

$$\langle\phi\frac{d}{dq}|q'\rangle = -\frac{d\phi(q')}{dq'},$$

showing that

$$\langle\phi\frac{d}{dq} = -\langle\frac{d\phi}{dq}. \qquad (16)$$

Thus $d/dq$ operating to the left on the conjugate complex of a wave
function has the meaning of minus differentiation with respect to $q$.

The validity of this result depends on our being able to make the 
passage from (14) to (15), which requires that we must restrict
ourselves to bras and kets corresponding to wave functions that satisfy
suitable boundary conditions. The conditions usually holding in
practice are that they vanish at the boundaries. (Somewhat more
general conditions will be given in the next section.) These conditions
do not limit the physical applicability of the theory, but, on the
contrary, are usually required also on physical grounds. For example,
if $q$ is a Cartesian coordinate of a particle, its eigenvalues run from
$-\infty$ to $\infty$, and the physical requirement that the particle has zero
probability of being at infinity leads to the condition that the wave
function vanishes for $q = \pm\infty$.

The conjugate complex of the linear operator $d/dq$ can be evaluated
by noting that the conjugate imaginary of $d/dq\cdot\psi\rangle$ or $d\psi/dq\rangle$ is
$\langle d\bar{\psi}/dq$, or $-\langle\bar{\psi}\,d/dq$ from (16). Thus the conjugate complex of $d/dq$
is $-d/dq$, so $d/dq$ is a pure imaginary linear operator.


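To get the representative of $d/dq$ we note that, from an application
of formula (63) of § 20,

$$|q''\rangle = \delta(q - q'')\rangle, \qquad (17)$$

so that

$$\frac{d}{dq}|q''\rangle = \frac{d}{dq}\delta(q - q'')\rangle, \qquad (18)$$

and hence

$$\langle q'|\frac{d}{dq}|q''\rangle = \frac{d}{dq'}\delta(q' - q''). \qquad (19)$$

The representative of $d/dq$ involves the derivative of the $\delta$ function.

Let us work out the commutation relation connecting $d/dq$ with $q$.
We have

$$\frac{d}{dq}q\psi\rangle = \frac{d(q\psi)}{dq}\rangle = q\frac{d}{dq}\psi\rangle + \psi\rangle. \qquad (20)$$

Since this holds for any ket $\psi\rangle$, we have

$$\frac{d}{dq}q - q\frac{d}{dq} = 1. \qquad (21)$$

Comparing this result with (10), we see that $-i\hbar\,d/dq$ satisfies the
same commutation relation with $q$ that $p$ does.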
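
Relations (21) and (10) can be tried out on polynomial wave functions,
for which differentiation is exact. The sketch below assumes $\hbar = 1$ and
represents $\psi(q)$ by its list of coefficients; the helper names are
hypothetical and serve only this illustration.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

def ddq(psi):
    """d/dq acting on a polynomial with coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(psi)][1:] or [0]

def times_q(psi):
    """Multiplication by q."""
    return [0] + list(psi)

def p_op(psi):
    """The operator -i * hbar * d/dq."""
    return [-1j * HBAR * c for c in ddq(psi)]

def sub(a, b):
    """Coefficientwise a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

psi = [5, -1, 0, 2]   # the sample function 5 - q + 2 q^3

# (d/dq q - q d/dq) psi, which by (21) should reproduce psi itself.
lhs_21 = sub(ddq(times_q(psi)), times_q(ddq(psi)))

# (q p - p q) psi with p = -i hbar d/dq, which by (10) should be i hbar psi.
lhs_10 = sub(times_q(p_op(psi)), p_op(times_q(psi)))
```

For this sample function `lhs_21` equals `psi` and `lhs_10` equals
`[1j * HBAR * c for c in psi]`.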

To extend the foregoing work to the case of arbitrary $n$, we write
the general ket as $\psi(q_1 \ldots q_n)\rangle = \psi\rangle$ and introduce the $n$ linear
operators $\partial/\partial q_r$ ($r = 1, \ldots, n$), which can operate on it in accordance with
the formula

$$\frac{\partial}{\partial q_r}\psi\rangle = \frac{\partial\psi}{\partial q_r}\rangle, \qquad (22)$$

corresponding to (11). We have

$$\frac{\partial}{\partial q_r}\rangle = 0, \qquad (23)$$

corresponding to (12). Provided we restrict ourselves to bras and
kets corresponding to wave functions satisfying suitable boundary
conditions, these linear operators can operate also on bras, in
accordance with the formula

$$\langle\phi\frac{\partial}{\partial q_r} = -\langle\frac{\partial\phi}{\partial q_r}, \qquad (24)$$

corresponding to (16). Thus $\partial/\partial q_r$ can operate to the left on the
conjugate complex of a wave function, when it has the meaning of
minus partial differentiation with respect to $q_r$. We find as before
that each $\partial/\partial q_r$ is a pure imaginary linear operator. Corresponding
to (21) we have the commutation relations

$$\frac{\partial}{\partial q_r}q_s - q_s\frac{\partial}{\partial q_r} = \delta_{rs}. \qquad (25)$$

We have further

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s}\psi\rangle = \frac{\partial^2\psi}{\partial q_r\,\partial q_s}\rangle = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}\psi\rangle, \qquad (26)$$

showing that

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s} = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}. \qquad (27)$$

Comparing (25) and (27) with (9), we see that the linear operators
$-i\hbar\,\partial/\partial q_r$ satisfy the same commutation relations with the $q$'s and with
each other that the $p$'s do.

It would be possible to take

$$p_r = -i\hbar\,\partial/\partial q_r \qquad (28)$$

without getting any inconsistency. This possibility enables us to see
that the $q$'s must form a complete commuting set of observables,
since it means that any function of the $q$'s and $p$'s could be taken
to be a function of the $q$'s and $-i\hbar\,\partial/\partial q$'s and then could not commute
with all the $q$'s unless it is a function of the $q$'s only.

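The equations (28) do not necessarily hold. But in any case the
quantities $p_r + i\hbar\,\partial/\partial q_r$ each commute with all the $q$'s, so each of them
is a function of the $q$'s, from Theorem 2 of § 19. Thus

$$p_r = -i\hbar\,\partial/\partial q_r + f_r(q). \qquad (29)$$

Since $p_r$ and $-i\hbar\,\partial/\partial q_r$ are both real, $f_r(q)$ must be real. For any
function $f$ of the $q$'s we have

$$\frac{\partial}{\partial q_r}f\psi\rangle = \frac{\partial(f\psi)}{\partial q_r}\rangle = f\frac{\partial}{\partial q_r}\psi\rangle + \frac{\partial f}{\partial q_r}\psi\rangle,$$

showing that

$$\frac{\partial}{\partial q_r}f - f\frac{\partial}{\partial q_r} = \frac{\partial f}{\partial q_r}. \qquad (30)$$

With the help of (29) we can now deduce the general formula

$$p_r f - f p_r = -i\hbar\,\partial f/\partial q_r. \qquad (31)$$

This formula may be written in P.B. notation

$$[f, p_r] = \partial f/\partial q_r, \qquad (32)$$

when it is the same as in the classical theory, as follows from (1).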


Multiplying (27) by $(-i\hbar)^2$ and substituting for $-i\hbar\,\partial/\partial q_r$ and $-i\hbar\,\partial/\partial q_s$
their values given by (29), we get

$$(p_r - f_r)(p_s - f_s) = (p_s - f_s)(p_r - f_r),$$

which reduces, with the help of the quantum condition $p_r p_s = p_s p_r$, to

$$p_r f_s + f_r p_s = p_s f_r + f_s p_r.$$

This reduces further, with the help of (31), to

$$\partial f_s/\partial q_r = \partial f_r/\partial q_s, \qquad (33)$$

showing that the functions $f_r$ are all of the form

$$f_r = \partial F/\partial q_r \qquad (34)$$

with $F$ independent of $r$. Equation (29) now becomes

$$p_r = -i\hbar\,\partial/\partial q_r + \partial F/\partial q_r. \qquad (35)$$

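We have been working with a representation which is fixed to the
extent that the $q$'s must be diagonal in it, but which contains arbitrary
phase factors. If the phase factors are changed, the operators $\partial/\partial q_r$
get changed. It will now be shown that, by a suitable change in the
phase factors, the function $F$ in (35) can be made to vanish, so that
equations (28) are made to hold.

Using stars to distinguish quantities referring to the new
representation with the new phase factors, we shall have the new basic
bras connected with the previous ones by

$$\langle q_1' \ldots q_n'{}^*| = e^{i\gamma'}\langle q_1' \ldots q_n'|, \qquad (36)$$

where $\gamma' = \gamma(q')$ is a real function of the $q'$'s. The new
representative of a ket is $e^{i\gamma'}$ times the old one, showing that
$e^{i\gamma}\psi\rangle^* = \psi\rangle$, so we set

$$\rangle^* = e^{-i\gamma}\rangle \qquad (37)$$

as the connexion between the new standard ket and the original one.
The new linear operator $(\partial/\partial q_r)^*$ satisfies, corresponding to (22),

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = \frac{\partial\psi}{\partial q_r}\rangle^* = e^{-i\gamma}\frac{\partial\psi}{\partial q_r}\rangle,$$

with the help of (37). Using (22), this gives

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = e^{-i\gamma}\frac{\partial}{\partial q_r}\psi\rangle = e^{-i\gamma}\frac{\partial}{\partial q_r}e^{i\gamma}\psi\rangle^*,$$

showing that

$$\left(\frac{\partial}{\partial q_r}\right)^* = e^{-i\gamma}\frac{\partial}{\partial q_r}e^{i\gamma}, \qquad (38)$$

or, with the help of (30),

$$\left(\frac{\partial}{\partial q_r}\right)^* = \frac{\partial}{\partial q_r} + i\frac{\partial\gamma}{\partial q_r}. \qquad (39)$$

By choosing $\gamma$ so that

$$F = \hbar\gamma + \text{a constant}, \qquad (40)$$

(35) becomes

$$p_r = -i\hbar\,(\partial/\partial q_r)^*. \qquad (41)$$

Equation (40) fixes $\gamma$ except for an arbitrary constant, so the
representation is fixed except for an arbitrary constant phase factor.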

In this way we see that a representation can be set up in which 
the q’s are diagonal and equations (28) hold. This representation is 
a very useful one for many problems. It will be called Schrödinger's
representation, as it was the representation in terms of which
Schrödinger gave his original formulation of quantum mechanics in 1926.
Schrödinger's representation exists whenever one has canonical $q$'s
and $p$'s, and is completely determined by these $q$'s and $p$'s except for
an arbitrary constant phase factor. It owes its great convenience to
its allowing one to express immediately any algebraic function of the
$q$'s and $p$'s of the form of a power series in the $p$'s as an operator of
differentiation, e.g. if $f(q_1, \ldots, q_n, p_1, \ldots, p_n)$ is such a function, we have

$$f(q_1, \ldots, q_n, p_1, \ldots, p_n) = f(q_1, \ldots, q_n, -i\hbar\,\partial/\partial q_1, \ldots, -i\hbar\,\partial/\partial q_n), \qquad (42)$$

provided we preserve the order of the factors in a product on
substituting the $-i\hbar\,\partial/\partial q$'s for the $p$'s.

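From (23) and (28), we have

$$p_r\rangle = 0. \qquad (43)$$

Thus the standard ket in Schrödinger's representation is characterized
by the condition that it is a simultaneous eigenket of all the momenta
belonging to the eigenvalues zero. Some properties of the basic
vectors of Schrödinger's representation may also be noted. Equation
(22) gives

$$\langle q_1' \ldots q_n'|\frac{\partial}{\partial q_r}\psi\rangle = \langle q_1' \ldots q_n'|\frac{\partial\psi}{\partial q_r}\rangle = \frac{\partial\psi(q_1' \ldots q_n')}{\partial q_r'} = \frac{\partial}{\partial q_r'}\langle q_1' \ldots q_n'|\psi\rangle.$$

Hence

$$\langle q_1' \ldots q_n'|\frac{\partial}{\partial q_r} = \frac{\partial}{\partial q_r'}\langle q_1' \ldots q_n'|, \qquad (44)$$

so that

$$\langle q_1' \ldots q_n'|p_r = -i\hbar\frac{\partial}{\partial q_r'}\langle q_1' \ldots q_n'|. \qquad (45)$$

Similarly, equation (24) leads to

$$p_r|q_1' \ldots q_n'\rangle = i\hbar\frac{\partial}{\partial q_r'}|q_1' \ldots q_n'\rangle. \qquad (46)$$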

23. The momentum representation 

Let us take a system with one degree of freedom, describable in
terms of a $q$ and $p$ with the eigenvalues of $q$ running from $-\infty$ to $\infty$,
and let us take an eigenket $|p'\rangle$ of $p$. Its representative in the
Schrödinger representation, $\langle q'|p'\rangle$, satisfies

$$p'\langle q'|p'\rangle = \langle q'|p|p'\rangle = -i\hbar\frac{d}{dq'}\langle q'|p'\rangle,$$

with the help of (45) applied to the case of one degree of freedom.
The solution of this differential equation for $\langle q'|p'\rangle$ is

$$\langle q'|p'\rangle = c'e^{ip'q'/\hbar}, \qquad (47)$$

where $c' = c(p')$ is independent of $q'$, but may involve $p'$.
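
That (47) solves the differential equation can be confirmed by a rough
numerical check. The sketch below assumes $\hbar = 1$, takes $c' = 1$, and
replaces $d/dq'$ by a central difference, so the residuals it reports are
small but not exactly zero; the function names are hypothetical.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1

def plane_wave(q, p_eig, c=1.0):
    """The representative (47), c * exp(i p' q' / hbar)."""
    return c * cmath.exp(1j * p_eig * q / HBAR)

def p_applied(f, q, step=1e-5):
    """-i hbar d/dq' applied to f at q, using a central difference."""
    return -1j * HBAR * (f(q + step) - f(q - step)) / (2 * step)

p_eig = 0.7                            # sample eigenvalue p'
points = (-3.0, 0.0, 1.5, 40.0)        # sample values of q'

# Size of  -i hbar d<q'|p'>/dq'  -  p' <q'|p'>  at each sample point.
residuals = [abs(p_applied(lambda x: plane_wave(x, p_eig), q)
                 - p_eig * plane_wave(q, p_eig)) for q in points]

# The modulus of (47) is the same at every q'.
moduli = [abs(plane_wave(q, p_eig)) for q in points]
```

The residuals are at the level of the finite-difference error, and the
constant modulus shows that the representative does not fall off for
large $|q'|$.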

The representative $\langle q'|p'\rangle$ does not satisfy the boundary conditions
of vanishing at $q' = \pm\infty$. This gives rise to some difficulty, which
shows itself up most directly in the failure of the orthogonality
theorem. If we take a second eigenket $|p''\rangle$ of $p$ with representative

$$\langle q'|p''\rangle = c''e^{ip''q'/\hbar},$$

belonging to a different eigenvalue $p''$, we shall have

$$\langle p'|p''\rangle = \int_{-\infty}^{\infty}\langle p'|q'\rangle\,dq'\,\langle q'|p''\rangle = \bar{c}'c''\int_{-\infty}^{\infty} e^{-i(p'-p'')q'/\hbar}\,dq'. \qquad (48)$$

This integral does not converge according to the usual definition of
convergence. To bring the theory into order, we adopt a new
definition of convergence of an integral whose domain extends to infinity,
analogous to the Cesàro definition of the sum of an infinite series.
With this new definition, an integral whose value to the upper limit
$q'$ is of the form $\cos aq'$ or $\sin aq'$, with $a$ a real number not zero, is
counted as zero when $q'$ tends to infinity, i.e. we take the mean value
of the oscillations, and similarly for the lower limit of $q'$ tending to
minus infinity. This makes the right-hand side of (48) vanish for
$p'' \neq p'$, so that the orthogonality theorem is restored. Also it makes
the right-hand sides of (13) and (14) equal when $\langle\phi$ and $\psi\rangle$ are
eigenvectors of $p$, so that eigenvectors of $p$ become permissible vectors to
use with the operator $d/dq$. Thus the boundary conditions that the
representative of a permissible bra or ket has to satisfy become
extended to allow the representative to oscillate like $\cos aq'$ or $\sin aq'$
as $q'$ goes to infinity or minus infinity.

For $p''$ very close to $p'$, the right-hand side of (48) involves a $\delta$
function. To evaluate it, we need the formula

$$\int_{-\infty}^{\infty} e^{iax}\,dx = 2\pi\,\delta(a) \qquad (49)$$

for real $a$, which may be proved as follows. The formula evidently
holds for $a$ different from zero, as both sides are then zero. Further
we have, for any continuous function $f(a)$,

$$\int_{-\infty}^{\infty} f(a)\,da\int_{-g}^{g} e^{iax}\,dx = \int_{-\infty}^{\infty} f(a)\,da\;2a^{-1}\sin ag = 2\pi f(0)$$

in the limit when $g$ tends to infinity. A more complicated argument
shows that we get the same result if instead of the limits $g$ and $-g$
we put $g_1$ and $-g_2$, and then let $g_1$ and $g_2$ tend to infinity in different
ways (not too widely different). This shows the equivalence of both
sides of (49) as factors in an integrand, which proves the formula.

With the help of (49), (48) becomes

$$\langle p'|p''\rangle = \bar{c}'c''\,2\pi\,\delta[(p'-p'')/\hbar] = \bar{c}'c''\,h\,\delta(p'-p'') = |c'|^2 h\,\delta(p'-p''). \qquad (50)$$
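
The limiting step used in proving (49) lends itself to a numerical
illustration. The sketch below takes the Gaussian $f(a) = e^{-a^2}$ as a sample
continuous function (an assumption of this example, as are the cut-off
`a_max`, the number of steps, and the function names) and evaluates the
integral of $f(a)\,2a^{-1}\sin ag$ by the trapezoidal rule for increasing $g$.

```python
import math

def smeared(f, g, a_max=8.0, n=16000):
    """Trapezoidal estimate of the integral of f(a) * 2 sin(a g) / a
    over -a_max <= a <= a_max."""
    h = 2.0 * a_max / n
    total = 0.0
    for k in range(n + 1):
        a = -a_max + k * h
        # The kernel 2 sin(a g) / a tends to 2 g as a tends to zero.
        kernel = 2.0 * g if a == 0.0 else 2.0 * math.sin(a * g) / a
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f(a) * kernel
    return total * h

def f(a):
    return math.exp(-a * a)   # sample function with f(0) = 1

values = {g: smeared(f, g) for g in (1.0, 5.0, 20.0)}
target = 2.0 * math.pi * f(0.0)
```

For this particular $f$ the integral is known in closed form,
$2\pi\,\mathrm{erf}(g/2)$, so the computed values can be compared with it and seen to
approach `target` as $g$ grows.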

We have obtained an eigenket of $p$ belonging to any real eigenvalue
$p'$, its representative being given by (47). Any ket $|X\rangle$ can be
expanded in terms of these eigenkets of $p$, since its representative
$\langle q'|X\rangle$ can be expanded in terms of the representatives (47) by
Fourier analysis. It follows that the momentum $p$ is an observable,
in agreement with the experimental result that momenta can be 
observed. 

A symmetry now appears between q and p. Each of them is an 
observable with eigenvalues extending from $-\infty$ to $\infty$, and the
commutation relation connecting $q$ and $p$, equation (10), remains
invariant if we interchange $q$ and $p$ and write $-i$ for $i$. We have set
up a representation in which $q$ is diagonal and $p = -i\hbar\,d/dq$. It
follows from the symmetry that we can also set up a representation
in which $p$ is diagonal and

$$q = i\hbar\,d/dp, \qquad (51)$$

the operator $d/dp$ being defined by a procedure similar to that used
for $d/dq$. This representation will be called the momentum
representation. It is less useful than the previous Schrödinger representation
because, while the Schrödinger representation enables one to express
as an operator of differentiation any function of $q$ and $p$ that is a
power series in $p$, the momentum representation enables one so to
express any function of $q$ and $p$ that is a power series in $q$, and the
important quantities in dynamics are almost always power series in
$p$ but are often not power series in $q$. All the same the momentum
representation is of value for certain problems (see § 50). 

Let us calculate the transformation function $\langle q'|p'\rangle$ connecting the
two representations. The basic kets $|p'\rangle$ of the momentum
representation are eigenkets of $p$ and their Schrödinger representatives $\langle q'|p'\rangle$
are given by (47) with the coefficients $c'$ suitably chosen. The phase
factors of these basic kets must be chosen so as to make (51) hold.
The easiest way to bring in this condition is to use the symmetry
between $q$ and $p$ referred to above, according to which $\langle q'|p'\rangle$ must
go over into $\langle p'|q'\rangle$ if we interchange $q'$ and $p'$ and write $-i$ for $i$.
Now $\langle q'|p'\rangle$ is equal to the right-hand side of (47) and $\langle p'|q'\rangle$ to the
conjugate complex expression, and hence $c'$ must be independent of
$p'$. Thus $c'$ is just a number $c$. Further, we must have

$$\langle p'|p''\rangle = \delta(p'-p''),$$

which shows, on comparison with (50), that $|c| = h^{-1/2}$. We can choose
the arbitrary constant phase factor in either representation so as to
make $c = h^{-1/2}$, and we then get

$$\langle q'|p'\rangle = h^{-1/2}e^{ip'q'/\hbar} \qquad (52)$$

for the transformation function.

The foregoing work may easily be generalized to a system with 
$n$ degrees of freedom, describable in terms of $n$ $q$'s and $p$'s, with the
eigenvalues of each $q$ running from $-\infty$ to $\infty$. Each $p$ will then be
an observable with eigenvalues running from $-\infty$ to $\infty$, and there
will be symmetry between the set of $q$'s and the set of $p$'s, the
commutation relations remaining invariant if we interchange each $q_r$
with the corresponding $p_r$ and write $-i$ for $i$. A momentum
representation can be set up in which the $p$'s are diagonal and each

$$q_r = i\hbar\,\partial/\partial p_r. \qquad (53)$$

The transformation function connecting it with the Schrödinger
representation will be given by the product of the transformation
functions for each degree of freedom separately, as is shown by
formula (67) of § 20, and will thus be

$$\langle q_1' q_2' \ldots q_n'|p_1' p_2' \ldots p_n'\rangle = \langle q_1'|p_1'\rangle\langle q_2'|p_2'\rangle \ldots \langle q_n'|p_n'\rangle = h^{-n/2}e^{i(p_1'q_1' + p_2'q_2' + \ldots + p_n'q_n')/\hbar}. \qquad (54)$$


24. Heisenberg’s principle of uncertainty 
For a system with one degree of freedom, the Schrödinger and the
momentum representatives of a ket $|X\rangle$ are connected by

$$\langle p'|X\rangle = h^{-1/2}\int_{-\infty}^{\infty} e^{-iq'p'/\hbar}\,dq'\,\langle q'|X\rangle,$$

$$\langle q'|X\rangle = h^{-1/2}\int_{-\infty}^{\infty} e^{iq'p'/\hbar}\,dp'\,\langle p'|X\rangle. \qquad (55)$$

These formulas have an elementary significance. They show that
either of the representatives is given, apart from numerical coefficients,
by the amplitudes of the Fourier components of the other.

It is interesting to apply (55) to a ket whose Schrödinger
representative consists of what is called a wave packet. This is a function
whose value is very small everywhere outside a certain domain, of
width $\Delta q'$ say, and inside this domain is approximately periodic with
a definite frequency.† If a Fourier analysis is made of such a wave
packet, the amplitude of all the Fourier components will be small,
except those in the neighbourhood of the definite frequency. The
components whose amplitudes are not small will fill up a frequency
band whose width is of the order $1/\Delta q'$, since two components whose
frequencies differ by this amount, if in phase in the middle of the
domain $\Delta q'$, will be just out of phase and interfering at the ends of
this domain. Now in the first of equations (55) the variable
$(2\pi)^{-1}p'/\hbar = p'/h$ plays the part of frequency. Thus with $\langle q'|X\rangle$ of the
form of a wave packet, the function $\langle p'|X\rangle$, being composed of the
amplitudes of the Fourier components of the wave packet, will be
small everywhere in the $p'$-space outside a certain domain of width

$$\Delta p' = h/\Delta q'.$$

Let us now apply the physical interpretation of the square of the 
modulus of the representative of a ket as a probability. We find that 
our wave packet represents a state for which a measurement of q is 
almost certain to lead to a result lying in a domain of width $\Delta q'$ and
a measurement of $p$ is almost certain to lead to a result lying in a
domain of width $\Delta p'$. We may say that for this state $q$ has a definite
value with an error of order $\Delta q'$ and $p$ has a definite value with an
error of order $\Delta p'$. The product of these two errors is

$$\Delta q'\,\Delta p' = h. \qquad (56)$$

Thus the more accurately one of the variables q,p has a definite 
value, the less accurately the other has a definite value. For a system 
with several degrees of freedom, equation (56) applies to each degree 
of freedom separately. 
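The reciprocal relation between the two widths can be checked numerically.
The following is a minimal sketch, not part of the original argument: it
assumes a Gaussian wave packet, units with hbar = 1, and root-mean-square
widths as one particular definition of the "errors" (the names
`packet_widths`, `sigma` and `p0` are illustrative). With that definition
the product comes out as hbar/2 whatever the width, i.e. a fixed fraction of
h; the numerical factor depends on how the width is defined.

```python
import numpy as np

hbar = 1.0                      # assumed units with hbar = 1, so h = 2*pi
h = 2.0 * np.pi * hbar

def packet_widths(sigma, p0=5.0, n=4096, span=80.0):
    """Return (dq, dp): r.m.s. widths of a Gaussian wave packet and of its
    momentum representative, the latter from a discrete Fourier transform."""
    q = np.linspace(-span / 2, span / 2, n, endpoint=False)
    step = q[1] - q[0]
    psi = np.exp(-q**2 / (4 * sigma**2)) * np.exp(1j * p0 * q / hbar)
    prob_q = np.abs(psi)**2
    prob_q /= prob_q.sum()
    # Momentum grid: p = h * (spatial frequency), since p'/h plays the
    # part of frequency in the first of equations (55).
    p = h * np.fft.fftfreq(n, d=step)
    prob_p = np.abs(np.fft.fft(psi))**2
    prob_p /= prob_p.sum()

    def rms(x, w):
        mean = np.sum(x * w)
        return np.sqrt(np.sum((x - mean)**2 * w))

    return rms(q, prob_q), rms(p, prob_p)

for sigma in (0.5, 1.0, 2.0):
    dq, dp = packet_widths(sigma)
    print(sigma, dq, dp, dq * dp / h)
```

Narrowing the packet in q (smaller `sigma`) widens it in p by the same
factor, which is the content of (56).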

Equation (56) is known as Heisenberg's Principle of Uncertainty.
It shows clearly the limitations in the possibility of simultaneously
assigning numerical values, for any particular state, to two
non-commuting observables, when those observables are a canonical
coordinate and momentum, and provides a plain illustration of how
observations in quantum mechanics may be incompatible. It also
shows how classical mechanics, which assumes that numerical values
can be assigned simultaneously to all observables, may be a valid
approximation when h can be considered as small enough to be
negligible. Equation (56) holds only in the most favourable case,
which occurs when the representative of the state is of the form of a
wave packet. Other forms of representative would lead to a $\Delta q'$ and
$\Delta p'$ whose product is larger than h.

† Frequency here means reciprocal of wave-length.

Heisenberg's principle of uncertainty shows that, in the limit when
either q or p is completely determined, the other is completely
undetermined. This result can also be obtained directly from the
transformation function $\langle q'|p'\rangle$. According to the end of § 18,
$|\langle q'|p'\rangle|^2\,dq'$ is proportional to the probability of q having
a value in the small range from $q'$ to $q'+dq'$ for the state for which p
certainly has the value $p'$, and from (52) this probability is independent
of $q'$ for a given $dq'$. Thus if p certainly has a definite value $p'$, all
values of q are equally probable. Similarly, if q certainly has a definite
value $q'$, all values of p are equally probable.

It is evident physically that a state for which all values of q are 
equally probable, or one for which all values of p are equally probable, 
cannot be attained in practice, in the first case because of limitations 
of size and in the second because of limitations of energy. Thus an 
eigenstate of p or an eigenstate of q cannot be attained in practice. 
The argument at the end of § 12 already showed that such eigenstates 
are unattainable, because of the infinite precision that would be 
needed to set them up, and we now have another argument leading 
to the same conclusion. 

25. Displacement operators 

We get a new insight into the meaning of some of the quantum
conditions by making a study of displacement operators. These appear
in the theory when we take into consideration that the scheme of
relations between states and dynamical variables given in Chapter II
is essentially a physical scheme, so that if certain states and dynamical
variables are connected by some relation, on our displacing them all
in a definite way (for example, displacing them all through a distance
$\delta x$ in the direction of the x-axis of Cartesian coordinates), the new
states and dynamical variables would have to be connected by the
same relation.

The displacement of a state or observable is a perfectly definite
process physically. Thus to displace a state or observable through a
distance $\delta x$ in the direction of the x-axis, we should merely have to
displace all the apparatus used in preparing the state, or all the
apparatus required to measure the observable, through the distance
$\delta x$ in the direction of the x-axis, and the displaced apparatus would
define the displaced state or observable. The displacement of a
dynamical variable must be just as definite as the displacement of
an observable, because of the close mathematical connexion between
dynamical variables and observables. A displaced state or dynamical
variable is uniquely determined by the undisplaced state or dynamical
variable together with the direction and magnitude of the
displacement.

The displacement of a ket vector is not such a definite thing though. 
If we take a certain ket vector, it will represent a certain state and we 
may displace this state and get a perfectly definite new state, but this 
new state will not determine our displaced ket, but only the direction 
of our displaced ket. We help to fix our displaced ket by requiring 
that it shall have the same length as the undisplaced ket, but even 
then it is not completely determined, but can still be multiplied by 
an arbitrary phase factor. One would think at first sight that each 
ket one displaces would have a different arbitrary phase factor, 
but with the help of the following argument, we see that it must be 
the same for them all. We make use of the law that superposition
relationships between states remain invariant under the displacement.
A superposition relationship between states is expressed
mathematically by a linear equation between the kets corresponding
to those states, for example

$$|R\rangle = c_1|A\rangle + c_2|B\rangle, \tag{57}$$

where $c_1$ and $c_2$ are numbers, and the invariance of the superposition
relationship requires that the displaced states correspond to kets
with the same linear equation between them; in our example they
would correspond to $|Rd\rangle$, $|Ad\rangle$, $|Bd\rangle$ say, satisfying

$$|Rd\rangle = c_1|Ad\rangle + c_2|Bd\rangle. \tag{58}$$

We take these kets to be our displaced kets, rather than these kets
multiplied by arbitrary independent phase factors, which latter
kets would satisfy a linear equation with different coefficients $c_1$, $c_2$.
The only arbitrariness now left in the displaced kets is that of a single
arbitrary phase factor to be multiplied into all of them.

The condition that linear equations between the kets remain
invariant under the displacement and that an equation such as (58)
holds whenever the corresponding (57) holds, means that the
displaced kets are linear functions of the undisplaced kets and thus each
displaced ket $|Pd\rangle$ is the result of some linear operator applied to the
corresponding undisplaced ket $|P\rangle$. In symbols,

$$|Pd\rangle = D|P\rangle, \tag{59}$$

where D is a linear operator independent of $|P\rangle$ and depending only
on the displacement. The arbitrary phase factor by which all the
displaced kets may be multiplied results in D being undetermined
to the extent of an arbitrary numerical factor of modulus unity.

With the displacement of kets made definite in the above manner 
and the displacement of bras, of course, made equally definite, 
through their being the conjugate imaginaries of the kets, we can 
now assert that any symbolic equation between kets, bras, and 
dynamical variables must remain invariant under the displacement 
of every symbol occurring in it, on account of such an equation 
having some physical significance which will not get changed by the 
displacement. 

Take as an example the equation

$$\langle Q|P\rangle = c,$$

c being a number. Then we must have

$$\langle Qd|Pd\rangle = c = \langle Q|P\rangle. \tag{60}$$

From the conjugate imaginary of (59) with Q instead of P,

$$\langle Qd| = \langle Q|\bar{D}. \tag{61}$$

Hence (60) gives

$$\langle Q|\bar{D}D|P\rangle = \langle Q|P\rangle.$$

Since this holds for arbitrary $\langle Q|$ and $|P\rangle$, we must have

$$\bar{D}D = 1, \tag{62}$$

giving us a general condition which D has to satisfy.

Take as a second example the equation

$$v|P\rangle = |R\rangle,$$

where v is any dynamical variable. Then, using $v_d$ to denote the
displaced dynamical variable, we must have

$$v_d|Pd\rangle = |Rd\rangle.$$

With the help of (59) we get

$$v_d|Pd\rangle = D|R\rangle = Dv|P\rangle = DvD^{-1}|Pd\rangle.$$

Since $|Pd\rangle$ can be any ket, we must have

$$v_d = DvD^{-1}, \tag{63}$$

which shows that the linear operator D determines the displacement
of dynamical variables as well as that of kets and bras. Note that
the arbitrary numerical factor of modulus unity in D does not affect
$v_d$, and also it does not affect the validity of (62).
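Conditions (62) and (63) can be illustrated with a finite model. The sketch
below is an assumption-laden toy, not the author's construction: the
coordinate takes values on a small cyclic grid of spacing `a`, and the
displacement D is the cyclic shift by one site, multiplied by an arbitrary
phase factor. The wrap-around at the first grid point is an artefact of the
finite grid.

```python
import numpy as np

n, a = 8, 0.5                         # assumed: 8 grid points, spacing a
x = a * np.arange(n)                  # eigenvalues of the coordinate x
X = np.diag(x).astype(complex)

# D moves the amplitude at site j to site j+1 (cyclically): a displacement by a.
D = np.roll(np.eye(n), 1, axis=0).astype(complex)
D = np.exp(0.7j) * D                  # arbitrary phase factor of modulus unity

D_bar = D.conj().T                    # conjugate complex (adjoint) of D
X_d = D @ X @ np.linalg.inv(D)        # displaced variable, equation (63)

print(np.allclose(D_bar @ D, np.eye(n)))        # condition (62)
print(np.diag(X_d).real)                        # x - a away from the wrap-around
```

The phase factor drops out of both checks, in line with the remark that it
affects neither $v_d$ nor the validity of (62).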

Let us now pass to an infinitesimal displacement, i.e. taking the
displacement through the distance $\delta x$ in the direction of the x-axis,
let us make $\delta x \to 0$. From physical continuity we should expect
a displaced ket $|Pd\rangle$ to tend to the original $|P\rangle$ and we may further
expect the limit

$$\lim_{\delta x\to 0}\frac{|Pd\rangle - |P\rangle}{\delta x} = \lim_{\delta x\to 0}\frac{D-1}{\delta x}\,|P\rangle$$

to exist. This requires that the limit

$$\lim_{\delta x\to 0}(D-1)/\delta x \tag{64}$$

shall exist. This limit is a linear operator which we shall call the
displacement operator for the x-direction and denote by $d_x$. The
arbitrary numerical factor $e^{i\gamma}$ with $\gamma$ real which we may multiply
into D must be made to tend to unity as $\delta x \to 0$ and then introduces
an arbitrariness in $d_x$, namely, $d_x$ may be replaced by

$$\lim_{\delta x\to 0}(De^{i\gamma}-1)/\delta x = \lim_{\delta x\to 0}(D-1+i\gamma)/\delta x = d_x + ia_x,$$

where $a_x$ is the limit of $\gamma/\delta x$. Thus $d_x$ contains an arbitrary additive
pure imaginary number.

For $\delta x$ small

$$D = 1 + \delta x\, d_x. \tag{65}$$

Substituting this into (62), we get

$$(1 + \delta x\, \bar{d}_x)(1 + \delta x\, d_x) = 1,$$

which reduces, with neglect of $\delta x^2$, to

$$\delta x\,(\bar{d}_x + d_x) = 0.$$

Thus $d_x$ is a pure imaginary linear operator. Substituting (65) into
(63) we get, with neglect of $\delta x^2$ again,

$$v_d = (1 + \delta x\, d_x)\,v\,(1 - \delta x\, d_x) = v + \delta x\,(d_x v - v d_x), \tag{66}$$

showing that

$$\lim_{\delta x\to 0}(v_d - v)/\delta x = d_x v - v d_x. \tag{67}$$

We may describe any dynamical system in terms of the following
dynamical variables: the Cartesian coordinates x, y, z of the centre of
mass of the system, the components $p_x$, $p_y$, $p_z$ of the total momentum
of the system, which are the canonical momenta conjugate to x, y, z
respectively, and any dynamical variables needed for describing
internal degrees of freedom of the system. If we suppose a piece
of apparatus which has been set up to measure x, to be displaced a
distance $\delta x$ in the direction of the x-axis, it will measure $x - \delta x$, hence

$$x_d = x - \delta x.$$

Comparing this with (66) for $v = x$, we obtain

$$d_x x - x d_x = -1. \tag{68}$$

This is the quantum condition connecting $d_x$ with x. From similar
arguments we find that y, z, $p_x$, $p_y$, $p_z$ and the internal dynamical
variables, which are unaffected by the displacement, must commute with
$d_x$. Comparing these results with (9), we see that $i\hbar d_x$ satisfies just
the same quantum conditions as $p_x$. Their difference, $p_x - i\hbar d_x$,
commutes with all the dynamical variables and must therefore be a
number. This number, which is necessarily real since $p_x$ and $i\hbar d_x$ are
both real, may be made zero by a suitable choice of the arbitrary,
pure imaginary number that can be added to $d_x$. We then have the
result

$$p_x = i\hbar\, d_x, \tag{69}$$

or the x-component of the total momentum of the system is $i\hbar$ times the
displacement operator $d_x$.
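Result (69) can be checked on a wave function. The sketch below rests on
assumptions that are not in the text: units with hbar = 1, a plane wave
exp(ikx) as the test state, the displaced state represented by f(x - dx),
and a small finite `dx` standing in for the limit (64). The names `psi`,
`displaced` and `i_hbar_d_x` are illustrative.

```python
import numpy as np

hbar = 1.0                           # assumed units
k = 3.0                              # assumed wave number of the test state

def psi(x):
    """Plane wave exp(ikx): an eigenfunction of p_x with eigenvalue hbar*k."""
    return np.exp(1j * k * x)

def displaced(f, dx):
    """Wave function of the state displaced through dx along x: f(x - dx)."""
    return lambda x: f(x - dx)

def i_hbar_d_x(f, x, dx=1e-6):
    """i*hbar*(D - 1)/dx applied to f, a finite-dx stand-in for the limit (64)."""
    return 1j * hbar * (displaced(f, dx)(x) - f(x)) / dx

x = np.linspace(-2.0, 2.0, 9)
ratio = i_hbar_d_x(psi, x) / psi(x)  # should be close to hbar * k everywhere
print(ratio)
```

Applying $i\hbar d_x$ to the plane wave multiplies it by (approximately)
hbar*k, the momentum eigenvalue, as (69) requires.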

This is a fundamental result, which gives a new significance to
displacement operators. There is a corresponding result, of course,
also for the y and z displacement operators $d_y$ and $d_z$. The quantum
conditions which state that $p_x$, $p_y$ and $p_z$ commute with each other
are now seen to be connected with the fact that displacements in
different directions are commutable operations.


26. Unitary transformations 

Let U be any linear operator that has a reciprocal $U^{-1}$ and
consider the equation

$$\alpha^* = U\alpha U^{-1}, \tag{70}$$

$\alpha$ being an arbitrary linear operator. This equation may be regarded
as expressing a transformation from any linear operator $\alpha$ to a
corresponding linear operator $\alpha^*$, and as such it has rather remarkable
properties. In the first place it should be noted that each $\alpha^*$ has the
same eigenvalues as the corresponding $\alpha$; since, if $\alpha'$ is any eigenvalue
of $\alpha$ and $|\alpha'\rangle$ is an eigenket belonging to it, we have

$$\alpha|\alpha'\rangle = \alpha'|\alpha'\rangle$$

and hence

$$\alpha^* U|\alpha'\rangle = U\alpha U^{-1}U|\alpha'\rangle = U\alpha|\alpha'\rangle = \alpha' U|\alpha'\rangle,$$

showing that $U|\alpha'\rangle$ is an eigenket of $\alpha^*$ belonging to the same
eigenvalue $\alpha'$, and similarly any eigenvalue of $\alpha^*$ may be shown to be also
an eigenvalue of $\alpha$. Further, if we take several $\alpha$'s that are connected
by algebraic equations and transform them all according to (70), the
corresponding $\alpha^*$'s will be connected by the same algebraic equations.
This result follows from the fact that the fundamental algebraic
processes of addition and multiplication are left invariant by the
transformation (70), as is shown by the following equations:

$$(\alpha_1 + \alpha_2)^* = U(\alpha_1 + \alpha_2)U^{-1} = U\alpha_1 U^{-1} + U\alpha_2 U^{-1} = \alpha_1^* + \alpha_2^*,$$
$$(\alpha_1\alpha_2)^* = U\alpha_1\alpha_2 U^{-1} = U\alpha_1 U^{-1}\,U\alpha_2 U^{-1} = \alpha_1^*\alpha_2^*.$$
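These invariance properties are easy to confirm with matrices. The
following is a minimal sketch under stated assumptions: the linear
operators are small random complex matrices, and U is taken close to the
unit matrix so that its reciprocal certainly exists and is well conditioned.
The helper names `random_operator` and `star` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def random_operator():
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Assumed choice: U near the unit matrix, hence invertible.
U = np.eye(n) + 0.1 * random_operator()
U_inv = np.linalg.inv(U)

def star(alpha):
    """The transformation (70): alpha -> U alpha U^{-1}."""
    return U @ alpha @ U_inv

a1, a2 = random_operator(), random_operator()

same_sum = np.allclose(star(a1 + a2), star(a1) + star(a2))
same_product = np.allclose(star(a1 @ a2), star(a1) @ star(a2))

# Eigenvalues are compared as sorted lists, since eigvals returns them unordered.
ev = np.sort_complex(np.linalg.eigvals(a1))
ev_star = np.sort_complex(np.linalg.eigvals(star(a1)))
same_eigenvalues = np.allclose(ev, ev_star)

print(same_sum, same_product, same_eigenvalues)
```

Note that U here is not unitary; the properties checked hold for any U
with a reciprocal, as the text states.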

Let us now see what condition would be imposed on U by the
requirement that any real $\alpha$ transforms into a real $\alpha^*$. Equation
(70) may be written

$$\alpha^* U = U\alpha. \tag{71}$$

Taking the conjugate complex of both sides in accordance with
(5) of § 8 we find, if $\alpha$ and $\alpha^*$ are both real,

$$\bar{U}\alpha^* = \alpha\bar{U}. \tag{72}$$

Equation (71) gives us

$$\bar{U}\alpha^* U = \bar{U}U\alpha$$

and equation (72) gives us

$$\bar{U}\alpha^* U = \alpha\bar{U}U.$$

Hence

$$\bar{U}U\alpha = \alpha\bar{U}U.$$

Thus $\bar{U}U$ commutes with any real linear operator and therefore also
with any linear operator whatever, since any linear operator can be
expressed as one real one plus i times another. Hence $\bar{U}U$ is a
number. It is obviously real, its conjugate complex according to (5)
of § 8 being the same as itself, and further it must be a positive
number, since for any ket $|P\rangle$, $\langle P|\bar{U}U|P\rangle$ is positive as well as
$\langle P|P\rangle$. We can suppose it to be unity without any loss of generality
in the transformation (70). We then have

$$\bar{U}U = 1. \tag{73}$$

Equation (73) is equivalent to any of the following

$$\bar{U} = U^{-1}, \qquad U = \bar{U}^{-1}, \qquad U^{-1}\bar{U}^{-1} = 1. \tag{74}$$

A matrix or linear operator U that satisfies (73) and (74) is said
to be unitary and a transformation (70) with unitary U is called a
unitary transformation. A unitary transformation transforms real
linear operators into real linear operators and leaves invariant any
algebraic equation between linear operators. It may be considered
as applying also to kets and bras, in accordance with the equations

$$|P^*\rangle = U|P\rangle, \qquad \langle P^*| = \langle P|\bar{U} = \langle P|U^{-1}, \tag{75}$$

and then it leaves invariant any algebraic equation between linear
operators, kets, and bras. It transforms eigenvectors of $\alpha$ into
eigenvectors of $\alpha^*$. From this one can easily deduce that it transforms an
observable into an observable and that it leaves invariant any
functional relation between observables based on the general definition
of a function given in § 11.

The inverse of a unitary transformation is also a unitary
transformation, since from (74), if U is unitary, $U^{-1}$ is also unitary.
Further, if two unitary transformations are applied in succession,
the result is a third unitary transformation, as may be verified in
the following way. Let the two unitary transformations be (70) and

$$\alpha^\dagger = V\alpha^* V^{-1}.$$

The connexion between $\alpha^\dagger$ and $\alpha$ is then

$$\alpha^\dagger = VU\alpha U^{-1}V^{-1} = (VU)\alpha(VU)^{-1} \tag{76}$$

from (42) of § 11. Now VU is unitary since

$$\overline{VU}\,VU = \bar{U}\bar{V}VU = \bar{U}U = 1,$$

and hence (76) is a unitary transformation.
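The group properties just stated can be checked numerically. This is a
sketch under assumptions of my own choosing: unitary matrices are generated
from the QR factorization of random complex matrices, "real" linear
operators are represented by self-adjoint matrices, and `bar` denotes the
conjugate transpose. None of these names come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

def random_unitary():
    """Unitary matrix from the QR factorization of a random complex matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def bar(a):
    """Conjugate complex of a linear operator: the conjugate transpose."""
    return a.conj().T

def is_unitary(a):
    return np.allclose(bar(a) @ a, np.eye(n)) and np.allclose(a @ bar(a), np.eye(n))

U, V = random_unitary(), random_unitary()
z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
alpha = z + bar(z)                     # a real (self-adjoint) operator
alpha_star = U @ alpha @ bar(U)        # (70) with U^{-1} = bar(U)

print(is_unitary(V @ U), is_unitary(np.linalg.inv(U)))
print(np.allclose(alpha_star, bar(alpha_star)))   # real stays real
```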

The transformation given in the preceding section from undisplaced 
to displaced quantities is an example of a unitary transformation, as 
is shown by equations (62), (63), corresponding to equations (73),
(70), and equations (59), (61), corresponding to equations (75).

In classical mechanics one can make a transformation from the
canonical coordinates and momenta $q_r$, $p_r$ (r = 1,.., n) to a new set of
variables $q_r^*$, $p_r^*$ (r = 1,.., n) satisfying the same P.B. relations as the
q's and p's, i.e. equations (8) of § 21 with q*'s and p*'s replacing the
q's and p's, and can express all dynamical variables in terms of the q*'s
and p*'s. The q*'s and p*'s are then also called canonical coordinates
and momenta and the transformation is called a contact
transformation. One can easily verify that the P.B. of any two dynamical
variables u and v is correctly given by formula (1) of § 21 with q*'s and
p*'s instead of q's and p's, so that the P.B. relationship is invariant
under a contact transformation. This results in the new canonical
coordinates and momenta being on the same footing as the original
ones for many purposes of general dynamical theory, even though the
new coordinates $q_r^*$ may not be a set of Lagrangian coordinates but
may be functions of the Lagrangian coordinates and velocities.

It will now be shown that, for a quantum dynamical system that
has a classical analogue, unitary transformations in the quantum theory
are the analogue of contact transformations in the classical theory.
Unitary transformations are more general than contact
transformations, since the former can be applied to systems in quantum
mechanics that have no classical analogue, but for those systems in
quantum mechanics which are describable in terms of canonical
coordinates and momenta, the analogy between the two kinds of
transformation holds. To establish it, we note that a unitary
transformation applied to the quantum variables $q_r$, $p_r$ gives new variables
$q_r^*$, $p_r^*$ satisfying the same P.B. relations, since the P.B. relations are
equivalent to the algebraic relations (9) of § 21 and algebraic relations
are left invariant by a unitary transformation. Conversely, any real
variables $q_r^*$, $p_r^*$ satisfying the P.B. relations for canonical coordinates
and momenta are connected with the $q_r$, $p_r$ by a unitary
transformation, as is shown by the following argument.

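We use the Schrödinger representation, and write the basic ket
$|q_1' \ldots q_n'\rangle$ as $|q'\rangle$ for brevity. Since we are assuming that the
$q_r^*$, $p_r^*$ satisfy the P.B. relations for canonical coordinates and momenta,
we can set up a Schrödinger representation referring to them, with
the $q_r^*$ diagonal and each $p_r^*$ equal to $-i\hbar\,\partial/\partial q_r^*$. The basic kets in
this second Schrödinger representation will be
$|q_1^{*\prime} \ldots q_n^{*\prime}\rangle$, which we write $|q^{*\prime}\rangle$ for brevity.
Now introduce the linear operator U defined by

$$\langle q^{*\prime}|U|q'\rangle = \delta(q^{*\prime} - q'), \tag{77}$$

where $\delta(q^{*\prime} - q')$ is short for

$$\delta(q^{*\prime} - q') = \delta(q_1^{*\prime} - q_1')\,\delta(q_2^{*\prime} - q_2') \ldots \delta(q_n^{*\prime} - q_n'). \tag{78}$$

The conjugate complex of (77) is

$$\langle q'|\bar{U}|q^{*\prime}\rangle = \delta(q^{*\prime} - q'),$$

and hence†

$$\langle q'|\bar{U}U|q''\rangle = \int \langle q'|\bar{U}|q^{*\prime}\rangle\, dq^{*\prime}\, \langle q^{*\prime}|U|q''\rangle
= \int \delta(q^{*\prime} - q')\, dq^{*\prime}\, \delta(q^{*\prime} - q'') = \delta(q' - q''),$$

so that $\bar{U}U = 1$. Thus U is a unitary operator. We have further

$$\langle q^{*\prime}|q_r^* U|q'\rangle = q_r^{*\prime}\,\delta(q^{*\prime} - q')$$

and

$$\langle q^{*\prime}|U q_r|q'\rangle = \delta(q^{*\prime} - q')\,q_r'.$$

The right-hand sides of these two equations are equal on account of
the property of the $\delta$ function (11) of § 15, and hence

$$q_r^* U = U q_r$$

or

$$q_r^* = U q_r U^{-1}.$$

Again, from (45) and (46),

$$\langle q^{*\prime}|p_r^* U|q'\rangle = -i\hbar\,\frac{\partial}{\partial q_r^{*\prime}}\,\delta(q^{*\prime} - q'),$$
$$\langle q^{*\prime}|U p_r|q'\rangle = i\hbar\,\frac{\partial}{\partial q_r'}\,\delta(q^{*\prime} - q').$$

The right-hand sides of these two equations are obviously equal, and
hence

$$p_r^* U = U p_r$$

or

$$p_r^* = U p_r U^{-1}.$$

Thus all the conditions for a unitary transformation are verified.

† We use the notation of a single integral sign and $dq^{*\prime}$ to denote an
integral over all the variables $q_1^{*\prime}, q_2^{*\prime}, \ldots, q_n^{*\prime}$. This
abbreviation will be used also in future work.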

We get an infinitesimal unitary transformation by taking U in (70)
to differ by an infinitesimal from unity. Put

$$U = 1 + i\epsilon F,$$

where $\epsilon$ is infinitesimal, so that its square can be neglected. Then

$$U^{-1} = 1 - i\epsilon F.$$

The unitary condition (73) or (74) requires that F shall be real. The
transformation equation (70) now takes the form

$$\alpha^* = (1 + i\epsilon F)\,\alpha\,(1 - i\epsilon F),$$

which gives

$$\alpha^* - \alpha = i\epsilon(F\alpha - \alpha F). \tag{79}$$

It may be written in P.B. notation

$$\alpha^* - \alpha = \epsilon\hbar[\alpha, F]. \tag{80}$$

If $\alpha$ is a canonical coordinate or momentum, this is formally the same
as a classical infinitesimal contact transformation.
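That (79) holds with neglect of the square of the infinitesimal can be seen
numerically. The sketch below assumes small random self-adjoint matrices
for F and for the transformed operator, and uses the exact reciprocal of
U = 1 + i*eps*F; the function name `residual` is illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

def hermitian():
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (z + z.conj().T) / 2

F, alpha = hermitian(), hermitian()   # assumed real (self-adjoint) operators

def residual(eps):
    """Largest entry of (alpha* - alpha) - i*eps*(F alpha - alpha F)."""
    U = np.eye(n) + 1j * eps * F
    alpha_star = U @ alpha @ np.linalg.inv(U)
    return np.abs(alpha_star - alpha - 1j * eps * (F @ alpha - alpha @ F)).max()

r1, r2 = residual(1e-3), residual(1e-4)
print(r1, r2, r1 / r2)
```

Reducing `eps` tenfold reduces the residual roughly a hundredfold, which
is the behaviour expected of a term of second order.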



V 

THE EQUATIONS OF MOTION 

27. Schrödinger's form for the equations of motion

Our work from § 5 onwards has all been concerned with one instant 
of time. It gave the general scheme of relations between states and 
dynamical variables for a dynamical system at one instant of time. 
To get a complete theory of dynamics we must consider also the 
connexion between different instants of time. When one makes an 
observation on the dynamical system, the state of the system gets 
changed in an unpredictable way, but in between observations 
causality applies, in quantum mechanics as in classical mechanics, 
and the system is governed by equations of motion which make the 
state at one time determine the state at a later time. These equations 
of motion we now proceed to study. They will apply so long as the
dynamical system is left undisturbed by any observation or similar
process.† Their general form can be deduced from the principle of
superposition of Chapter I.

† The preparation of a state is a process of this kind. It often takes the form of
making an observation and selecting the system when the result of the observation
turns out to be a certain pre-assigned number.

Let us consider a particular state of motion throughout the time
during which the system is left undisturbed. We shall have the state
at any time t corresponding to a certain ket which depends on t and
which may be written $|t\rangle$. If we deal with several of these states of
motion we distinguish them by giving them labels such as A, and we
then write the ket which corresponds to the state at time t for one
of them $|At\rangle$. The requirement that the state at one time determines
the state at another time means that $|At_0\rangle$ determines $|At\rangle$ except
for a numerical factor. The principle of superposition applies to these
states of motion throughout the time during which the system is
undisturbed, and means that if we take a superposition relation
holding for certain states at time $t_0$ and giving rise to a linear equation
between the corresponding kets, e.g. the equation

$$|Rt_0\rangle = c_1|At_0\rangle + c_2|Bt_0\rangle,$$

the same superposition relation must hold between the states of
motion throughout the time during which the system is undisturbed
and must lead to the same equation between the kets corresponding
to these states at any time t (in the undisturbed time interval), i.e.
the equation

$$|Rt\rangle = c_1|At\rangle + c_2|Bt\rangle,$$

provided the arbitrary numerical factors by which these kets may be
multiplied are suitably chosen. It follows that the $|Pt\rangle$'s are linear
functions of the $|Pt_0\rangle$'s and each $|Pt\rangle$ is the result of some linear
operator applied to $|Pt_0\rangle$. In symbols

$$|Pt\rangle = T|Pt_0\rangle, \tag{1}$$

where T is a linear operator independent of P and depending only
on t (and $t_0$).

We now assume that each $|Pt\rangle$ has the same length as the
corresponding $|Pt_0\rangle$. It is not necessarily possible to choose the arbitrary
numerical factors by which the $|Pt\rangle$'s may be multiplied so as to
make this so without destroying the linear dependence of the $|Pt\rangle$'s
on the $|Pt_0\rangle$'s, so the new assumption is a physical one and not just
a question of notation. It involves a kind of sharpening of the
principle of superposition. The arbitrariness in $|Pt\rangle$ now becomes
merely a phase factor, which must be independent of P in order that
the linear dependence of the $|Pt\rangle$'s on the $|Pt_0\rangle$'s may be preserved.
From the condition that the length of $c_1|Pt\rangle + c_2|Qt\rangle$ equals that of
$c_1|Pt_0\rangle + c_2|Qt_0\rangle$ for any complex numbers $c_1$, $c_2$, we can deduce that

$$\langle Qt|Pt\rangle = \langle Qt_0|Pt_0\rangle. \tag{2}$$
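The deduction of (2) from the equality of lengths can be spelled out as
follows. This is a sketch of one standard route, offered as an
illustration and not necessarily the argument the author had in mind; it
writes the squared length of a ket $|A\rangle$ as $\langle A|A\rangle$ and
uses two particular choices of the coefficients.

```latex
% Squared length of c_1|Pt> + c_2|Qt>, expanded by linearity:
\begin{align*}
\bigl\| c_1|Pt\rangle + c_2|Qt\rangle \bigr\|^2
  &= |c_1|^2\langle Pt|Pt\rangle + |c_2|^2\langle Qt|Qt\rangle
   + \bar{c}_1 c_2\langle Pt|Qt\rangle + c_1\bar{c}_2\langle Qt|Pt\rangle .
\end{align*}
% The first two terms have the same values at t and at t_0 (equal lengths),
% so the cross terms must agree as well, for all c_1, c_2:
\begin{align*}
\bar{c}_1 c_2\langle Pt|Qt\rangle + c_1\bar{c}_2\langle Qt|Pt\rangle
  &= \bar{c}_1 c_2\langle Pt_0|Qt_0\rangle + c_1\bar{c}_2\langle Qt_0|Pt_0\rangle .
\end{align*}
% Choice c_1 = c_2 = 1 fixes the real part, choice c_1 = i, c_2 = 1 the
% imaginary part:
\begin{align*}
2\,\mathrm{Re}\,\langle Qt|Pt\rangle &= 2\,\mathrm{Re}\,\langle Qt_0|Pt_0\rangle, &
-2\,\mathrm{Im}\,\langle Qt|Pt\rangle &= -2\,\mathrm{Im}\,\langle Qt_0|Pt_0\rangle .
\end{align*}
```

Equality of both the real and the imaginary parts gives (2).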

The connexion between the $|Pt\rangle$'s and $|Pt_0\rangle$'s is formally similar
to the connexion we had in § 25 between the displaced and undisplaced
kets, with a process of time displacement instead of the space
displacement of § 25. Equations (1) and (2) play the part of equations (59)
and (60) of § 25. We can develop the consequences of these equations
as in § 25 and can deduce that T contains an arbitrary numerical
factor of modulus unity and satisfies

$$\bar{T}T = 1, \tag{3}$$

corresponding to (62) of § 25, so T is unitary. We pass to the
infinitesimal case by making $t \to t_0$ and assume from physical continuity that
the limit

$$\lim_{t\to t_0}\frac{|Pt\rangle - |Pt_0\rangle}{t - t_0}$$

exists. This limit is just the derivative of $|Pt_0\rangle$ with respect to $t_0$.
From (1) it equals

$$\frac{d|Pt_0\rangle}{dt_0} = \left\{\lim_{t\to t_0}\frac{T - 1}{t - t_0}\right\}|Pt_0\rangle. \tag{4}$$

The limit operator occurring here is, like (64) of § 25, a pure imaginary
linear operator and is undetermined to the extent of an arbitrary
additive pure imaginary number. Putting this limit operator
multiplied by $i\hbar$ equal to H, or rather $H(t_0)$ since it may depend on $t_0$,
equation (4) becomes, when written for a general t,

$$i\hbar\,\frac{d|Pt\rangle}{dt} = H(t)|Pt\rangle. \tag{5}$$

Equation (5) gives the general law for the variation with time of
the ket corresponding to the state at any time. It is Schrödinger's
form for the equations of motion. It involves just one real linear
operator H(t), which must be characteristic of the dynamical system
under consideration. We assume that H(t) is the total energy of
the system. There are two justifications for this assumption, (i) the
analogy with classical mechanics, which will be developed in the
next section, and (ii) we have H(t) appearing as $i\hbar$ times an operator
of displacement in time similar to the operators of displacement in
the x, y, and z directions of § 25, so corresponding to (69) of § 25
we should have H(t) equal to the total energy, since the theory of
relativity puts energy in the same relation to time as momentum to
distance.

We assume on physical grounds that the total energy of a system 
is always an observable. For an isolated system it is a constant, and 
may then be written H. Even when it is not a constant we shall often 
write it simply H, leaving its dependence on t understood. If the 
energy depends on t, it means the system is acted on by external 
forces. An action of this kind is to be distinguished from a
disturbance caused by a process of observation, as the former is compatible
with causality and equations of motion while the latter is not.

We can get a connexion between H(t) and the T of equation (1)
by substituting for $|Pt\rangle$ in (5) its value given by equation (1). This
gives

$$i\hbar\,\frac{dT}{dt}\,|Pt_0\rangle = H(t)T|Pt_0\rangle.$$

Since $|Pt_0\rangle$ may be any ket, we have

$$i\hbar\,\frac{dT}{dt} = H(t)T. \tag{6}$$
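Equation (6) can be tested in the simplest case. The sketch below goes
beyond the text in assuming a time-independent H (a hypothetical 2 x 2
self-adjoint matrix), units with hbar = 1, and the exponential form
T = exp(-iH(t - t0)/hbar) for the solution with T = 1 at t = t0. The
derivative is approximated by a central difference.

```python
import numpy as np

hbar = 1.0                                   # assumed units

H = np.array([[1.0, 0.5 - 0.2j],
              [0.5 + 0.2j, -0.3]])           # hypothetical real (self-adjoint) H

energies, vecs = np.linalg.eigh(H)

def T(t, t0=0.0):
    """T = exp(-i H (t - t0) / hbar), built from the eigenvectors of H."""
    phases = np.exp(-1j * energies * (t - t0) / hbar)
    return (vecs * phases) @ vecs.conj().T

t, dt = 0.8, 1e-5
dT_dt = (T(t + dt) - T(t - dt)) / (2 * dt)   # central difference
lhs = 1j * hbar * dT_dt
rhs = H @ T(t)

print(np.allclose(lhs, rhs, atol=1e-7))
print(np.allclose(T(t).conj().T @ T(t), np.eye(2)))   # condition (3)
```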

Equation (5) is very important for practical problems, where it is
usually used in conjunction with a representation. Introducing a
representation with a complete set of commuting observables $\xi$
diagonal and putting $\langle\xi'|Pt\rangle$ equal to $\psi(\xi' t)$, we have, passing to the
standard ket notation,

$$|Pt\rangle = \psi(\xi t)\rangle.$$

Equation (5) now becomes

$$i\hbar\,\frac{\partial}{\partial t}\,\psi(\xi t)\rangle = H\psi(\xi t)\rangle. \tag{7}$$

Equation (7) is known as Schrödinger's wave equation and its solutions
$\psi(\xi t)$ are time-dependent wave functions. Each solution corresponds to
a state of motion of the system and the square of its modulus gives
the probability of the $\xi$'s having specified values at any time t. For
a system describable in terms of canonical coordinates and momenta
we may use Schrödinger's representation and can then take H to be
an operator of differentiation in accordance with (42) of § 22.


28. Heisenberg’s form for the equations of motion 

In the preceding section we set up a picture of the states of
undisturbed motion by making each of them correspond to a moving
ket, the state at any time corresponding to the ket at that time. We
shall call this the Schrödinger picture. Let us apply to our kets the
unitary transformation which makes each ket $|a\rangle$ go over into

$$|a^*\rangle = T^{-1}|a\rangle. \tag{8}$$

This transformation is of the form given by (75) of § 26 with $T^{-1}$ for
U, but it depends on the time t since T depends on t. It is thus to be
pictured as the application of a continuous motion (consisting of
rotations and uniform deformations) to the whole ket vector space.
A ket which is originally fixed becomes a moving one, its motion being
given by (8) with $|a\rangle$ independent of t. On the other hand, a ket
which is originally moving to correspond to a state of undisturbed
motion, i.e. in accordance with equation (1), becomes fixed; since on
substituting $|Pt\rangle$ for $|a\rangle$ in (8) we get $|a^*\rangle$ independent of t. Thus
the transformation brings the kets corresponding to states of undisturbed
motion to rest.

The unitary transformation must be applied also to bras and linear
operators, in order that equations between the various quantities may
remain invariant. The transformation applied to bras is given by the
conjugate imaginary of (8) and applied to linear operators it is given
by (70) of § 26 with $T^{-1}$ for U, i.e.

$$\alpha^* = T^{-1}\alpha T. \tag{9}$$

A linear operator which is originally fixed transforms into a moving
linear operator in general. Now a dynamical variable corresponds to
a linear operator which is originally fixed (because it does not refer
to t at all), so after the transformation it corresponds to a moving
linear operator. The transformation thus leads us to a new picture
of the motion, in which the states correspond to fixed vectors and
the dynamical variables to moving linear operators. We shall call
this the Heisenberg picture.

The physical condition of the dynamical system at any time 
involves the relation of the dynamical variables to the state, and 
the change of the physical condition with time may be ascribed 
either to a change in the state, with the dynamical variables kept 
fixed, which gives us the Schrodinger picture, or to a change in the 
dynamical variables, with the state kept fixed, which gives us the 
Heisenberg picture. 

In the Heisenberg picture there are equations of motion for the
dynamical variables. Take a dynamical variable corresponding to
the fixed linear operator v in the Schrödinger picture. In the
Heisenberg picture it corresponds to a moving linear operator, which we
write as $v_t$ instead of $v^*$, to bring out its dependence on t, and which
is given by

$$v_t = T^{-1}vT \tag{10}$$

or $Tv_t = vT$. Differentiating with respect to t, we get

$$\frac{dT}{dt}\,v_t + T\,\frac{dv_t}{dt} = v\,\frac{dT}{dt}.$$

With the help of (6), this gives

$$HTv_t + i\hbar\,T\,\frac{dv_t}{dt} = vHT$$

or

$$i\hbar\,\frac{dv_t}{dt} = T^{-1}vHT - T^{-1}HTv_t = v_tH_t - H_tv_t, \tag{11}$$

where

$$H_t = T^{-1}HT. \tag{12}$$

Equation (11) may be written in P.B. notation

$$\frac{dv_t}{dt} = [v_t, H_t]. \tag{13}$$

Equation (11) or (13) shows how any dynamical variable varies
with time in the Heisenberg picture and gives us Heisenberg's form
for the equations of motion. These equations of motion are determined
by the one linear operator $H_t$, which is just the transform of the linear
operator H occurring in Schrödinger's form for the equations of
motion and corresponds to the energy in the Heisenberg picture. We
shall call the dynamical variables in the Heisenberg picture, where
they vary with the time, Heisenberg dynamical variables, to distinguish
them from the fixed dynamical variables of the Schrödinger picture,
which we shall call Schrödinger dynamical variables. Each Heisenberg
dynamical variable is connected with the corresponding Schrödinger
dynamical variable by equation (10). Since this connexion is a unitary
transformation, all algebraic and functional relationships are the
same for both kinds of dynamical variable. We have T = 1 for
$t = t_0$, so that $v_{t_0} = v$ and any Heisenberg dynamical variable at time
$t_0$ equals the corresponding Schrödinger dynamical variable.
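The relation between the two pictures can be checked on a two-state
example. The sketch below makes assumptions not found in the text: a
time-independent hypothetical Hamiltonian, units with hbar = 1, t0 = 0, and
T = exp(-iHt/hbar). It verifies equation (11) by a central difference and
compares the average value of v computed in each picture.

```python
import numpy as np

hbar = 1.0                                                # assumed units
H = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)     # hypothetical Hamiltonian
v = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)    # hypothetical Schrodinger variable

energies, vecs = np.linalg.eigh(H)

def T(t):
    """T = exp(-i H t / hbar), assuming constant H and t0 = 0."""
    return (vecs * np.exp(-1j * energies * t / hbar)) @ vecs.conj().T

def heisenberg(op, t):
    """Equation (10): op_t = T^{-1} op T, with T^{-1} the adjoint of the unitary T."""
    return T(t).conj().T @ op @ T(t)

t, dt = 0.6, 1e-5
v_t, H_t = heisenberg(v, t), heisenberg(H, t)
dv_dt = (heisenberg(v, t + dt) - heisenberg(v, t - dt)) / (2 * dt)

ket0 = np.array([1.0, 0.0], dtype=complex)                # fixed Heisenberg ket
ket_t = T(t) @ ket0                                       # moving Schrodinger ket
mean_schrodinger = np.vdot(ket_t, v @ ket_t)
mean_heisenberg = np.vdot(ket0, v_t @ ket0)

print(np.allclose(1j * hbar * dv_dt, v_t @ H_t - H_t @ v_t, atol=1e-7))
print(np.allclose(mean_schrodinger, mean_heisenberg))
```

The two averages agree because the physical content lies in the relation
of the dynamical variables to the state, which is the same in both pictures.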

Equation (13) can be compared with classical mechanics, where we 
also have dynamical variables varying with the time. The equations 
of motion of classical mechanics can be written in the Hamiltonian 

form dq, 8H dp, 8H 

dt dp/ dt 'dq/ 1 } 

where the q's and p’s are a set of canonical coordinates and momenta 
and H is the energy expressed as a function of them and possibly also 
of t. The energy expressed in this way is called the Hamiltonian. 
Equations (14) give, for v any function of the q’ s and p’s that does 
not contain the time t explicitly, 

dv __ V / dq rjL ^ V dP r \ 
dt 2, [dq T dt 8p r dt ) 

8v 8H 8v 8H 
8q r dp T ~ 8p r 8q r 

v = (15) 

with the classical definition of a P.B., equation (1) of § 21. This is 
of the same form as equation (13) in the quantum theory. We thus 
get an analogy between the classical equations of motion in the 
Hamiltonian form and the quantum equations of motion in Heisen¬ 
berg’s form. This analogy provides a justification for the assumption 

3595.57 
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that the linear operator H introduced in the preceding section is the 
energy of the system in quantum mechanics. 
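
The classical statement (15) can be checked numerically. The following is a minimal sketch, assuming a single degree of freedom and a harmonic-oscillator Hamiltonian chosen purely for illustration (the names `MASS`, `STIFFNESS`, and `sample_variable` are hypothetical and do not come from the text). It compares dv/dt obtained from Hamilton's equations (14) with the P.B. [v, H], using central differences for the partial derivatives.

```python
def partial(f, q, p, wrt, eps=1e-6):
    """Central-difference estimate of df/dq or df/dp at the point (q, p)."""
    if wrt == "q":
        return (f(q + eps, p) - f(q - eps, p)) / (2 * eps)
    return (f(q, p + eps) - f(q, p - eps)) / (2 * eps)


def poisson_bracket(u, v, q, p):
    """Classical P.B. [u, v] = du/dq * dv/dp - du/dp * dv/dq (one q, one p)."""
    return (partial(u, q, p, "q") * partial(v, q, p, "p")
            - partial(u, q, p, "p") * partial(v, q, p, "q"))


# Hypothetical system: harmonic oscillator with assumed mass and stiffness.
MASS, STIFFNESS = 2.0, 3.0


def hamiltonian(q, p):
    """Energy expressed as a function of the canonical variables."""
    return p * p / (2 * MASS) + STIFFNESS * q * q / 2


def rate_from_hamilton_equations(v, q, p):
    """dv/dt from the chain rule, with dq/dt and dp/dt taken from (14)."""
    dq_dt = partial(hamiltonian, q, p, "p")    # dq/dt =  dH/dp
    dp_dt = -partial(hamiltonian, q, p, "q")   # dp/dt = -dH/dq
    return partial(v, q, p, "q") * dq_dt + partial(v, q, p, "p") * dp_dt


def sample_variable(q, p):
    """Any function of q and p that does not contain the time explicitly."""
    return q * p


q0, p0 = 0.7, -1.3
lhs = rate_from_hamilton_equations(sample_variable, q0, p0)
rhs = poisson_bracket(sample_variable, hamiltonian, q0, p0)
print(abs(lhs - rhs) < 1e-6)
```

For this choice of v the bracket can also be worked out by hand as p²/m − kq², which gives an independent check on the finite-difference result.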

In classical mechanics a dynamical system is defined mathematically
when the Hamiltonian is given, i.e. when the energy is given
in terms of a set of canonical coordinates and momenta, as this is
sufficient to fix the equations of motion. In quantum mechanics a
dynamical system is defined mathematically when the energy is
given in terms of dynamical variables whose commutation relations
are known, as this is then sufficient to fix the equations of motion,
in both Schrödinger’s and Heisenberg’s form. We need to have
either H expressed in terms of the Schrödinger dynamical variables
or $H_t$ expressed in terms of the corresponding Heisenberg dynamical
variables, the functional relationship being, of course, the same in 
both cases. We call the energy expressed in this way the Hamiltonian 
of the dynamical system in quantum mechanics, to keep up the 
analogy with the classical theory. 

A system in quantum mechanics always has a Hamiltonian, whether
the system is one that has a classical analogue and is describable in
terms of canonical coordinates and momenta or not. However, if the
system does have a classical analogue, its connexion with classical
mechanics is specially close and one can usually assume that the
Hamiltonian is the same function of the canonical coordinates and
momenta in the quantum theory as in the classical theory.† There
would be a difficulty in this, of course, if the classical Hamiltonian
involved a product of factors whose quantum analogues do not commute,
as one would not know in which order to put these factors in
the quantum Hamiltonian, but this does not happen for most of the
elementary dynamical systems whose study is important for atomic
physics. In consequence we are able also largely to use the same
language for describing dynamical systems in the quantum theory as
in the classical theory (e.g. to talk about particles with given masses
moving through given fields of force), and when given a system in
classical mechanics, can usually give a meaning to ‘the same’ system
in quantum mechanics.

† This assumption is found in practice to be successful only when applied with the
dynamical coordinates and momenta referring to a Cartesian system of axes and not
to more general curvilinear coordinates.

Equation (13) holds for $v_t$ any function of the Heisenberg dynamical
variables not involving the time explicitly, i.e. for v any constant
linear operator in the Schrödinger picture. It shows that such a
function $v_t$ is constant if it commutes with $H_t$ or if v commutes with H.
We then have

$$v_t = v_{t_0} = v,$$

and we call $v_t$ or v a constant of the motion. It is necessary that v shall
commute with H at all times, which is usually possible only if H is
constant. In this case we can substitute H for v in (13) and deduce
that $H_t$ is constant, showing that H itself is then a constant of the
motion. Thus if the Hamiltonian is constant in the Schrödinger
picture, it is also constant in the Heisenberg picture.

For an isolated system, a system not acted on by any external 
forces, there are always certain constants of the motion. One of these 
is the total energy or Hamiltonian. Others are provided by the 
displacement theory of § 25. It is evident physically that the total 
energy must remain unchanged if all the dynamical variables are 
displaced in a certain way, so equation (63) of § 25 must hold with
$v_d = v = H$. Thus D commutes with H and is a constant of the
motion. Passing to the case of an infinitesimal displacement, we see
that the displacement operators $d_x$, $d_y$, and $d_z$ are constants of the
motion and hence, from (69) of § 25, the total momentum is a constant
of the motion. Again, the total energy must remain unchanged if all
the dynamical variables are subjected to a certain rotation. This
leads, as will be shown in § 35, to the result that the total angular
momentum is a constant of the motion. The laws of conservation of
energy, momentum, and angular momentum hold for an isolated system
in the Heisenberg picture in quantum mechanics, as they hold in
classical mechanics.

Two forms for the equations of motion of quantum mechanics have
now been given. Of these, the Schrödinger form is the more useful
one for practical problems, as it provides the simpler equations. The
unknowns in Schrödinger’s wave equation are the numbers which
form the representative of a ket vector, while Heisenberg’s equation
of motion for a dynamical variable, if expressed in terms of a
representation, would involve as unknowns the numbers forming the
representative of the dynamical variable. The latter are far more
numerous and therefore more difficult to evaluate than the
Schrödinger unknowns. Heisenberg’s form for the equations of motion is
of value in providing an immediate analogy with classical mechanics
and enabling one to see how various features of classical theory, such
as the conservation laws referred to above, are translated into
quantum theory.

29. Stationary states 

We shall here deal with a dynamical system whose energy is constant.
Certain specially simple relations hold for this case. Equation
(6) can be integrated† to give

$$T = e^{-iH(t-t_0)/\hbar},$$

with the help of the initial condition that T = 1 for $t = t_0$. This
result substituted into (1) gives

$$|Pt\rangle = e^{-iH(t-t_0)/\hbar}\,|Pt_0\rangle, \tag{16}$$

which is the integral of Schrödinger’s equation of motion (5), and
substituted into (10) it gives

$$v_t = e^{iH(t-t_0)/\hbar}\, v\, e^{-iH(t-t_0)/\hbar}, \tag{17}$$

which is the integral of Heisenberg’s equation of motion (11), $H_t$ being
now equal to H. Thus we have solutions of the equations of motion
in a simple form. However, these solutions are not of much practical
value, because of the difficulty involved in evaluating the operator
$e^{-iH(t-t_0)/\hbar}$ unless H is particularly simple, and for practical purposes
one usually has to fall back on Schrödinger’s wave equation.

Let us consider a state of motion such that at time $t_0$ it is an
eigenstate of the energy. The ket $|Pt_0\rangle$ corresponding to it at this time
must be an eigenket of H. If $H'$ is the eigenvalue to which it belongs,
equation (16) gives

$$|Pt\rangle = e^{-iH'(t-t_0)/\hbar}\,|Pt_0\rangle,$$

showing that $|Pt\rangle$ differs from $|Pt_0\rangle$ only by a phase factor. Thus
the state always remains an eigenstate of the energy, and further, it
does not vary with the time at all, since the direction of the ket $|Pt\rangle$
does not vary with the time. Such a state is called a stationary state.
The probability for any particular result of an observation on it is 
independent of the time when the observation is made. From our 
assumption that the energy is an observable, there are sufficient 
stationary states for an arbitrary state to be dependent on them. 

The time-dependent wave function $\psi(\xi t)$ representing a stationary
state of energy $H'$ will vary with time according to the law

$$\psi(\xi t) = \psi_0(\xi)\, e^{-iH't/\hbar}, \tag{18}$$

and Schrödinger’s wave equation (7) for it reduces to

$$H'\psi_0\rangle = H\psi_0\rangle. \tag{19}$$

This equation merely asserts that the state represented by $\psi_0$ is an
eigenstate of H. We call a function $\psi_0$ satisfying (19) an eigenfunction
of H, belonging to the eigenvalue $H'$.

† The integration can be carried out as though H were an ordinary algebraic
variable instead of a linear operator, because there is no quantity that does not
commute with H in the work.

In the Heisenberg picture the stationary states correspond to fixed 
eigenvectors of the energy. We can set up a representation in which 
all the basic vectors are eigenvectors of the energy and so correspond 
to stationary states in the Heisenberg picture. We call such a
representation a Heisenberg representation. The first form of quantum
mechanics, discovered by Heisenberg in 1925, was in terms of a
representation of this kind. The energy is diagonal in the
representation. Any other diagonal dynamical variable must commute with the
energy and is therefore a constant of the motion. The problem of 
setting up a Heisenberg representation thus reduces to the problem 
of finding a complete set of commuting observables, each of which 
is a constant of the motion, and then making these observables 
diagonal. The energy must be a function of these observables, from 
Theorem 2 of § 19. It is sometimes convenient to take the energy 
itself as one of them. 

Let α denote the complete set of commuting observables in a
Heisenberg representation, so that the basic vectors are written $\langle\alpha'|$,
$|\alpha''\rangle$. The energy is a function of these observables α, say H = H(α).
From (17) we get

$$\langle\alpha'|v_t|\alpha''\rangle = e^{i(H'-H'')(t-t_0)/\hbar}\,\langle\alpha'|v|\alpha''\rangle, \tag{20}$$

where $H' = H(\alpha')$ and $H'' = H(\alpha'')$. The factor $\langle\alpha'|v|\alpha''\rangle$ on the
right-hand side here is independent of t, being an element of the matrix
representing the fixed linear operator v. Formula (20) shows how the
Heisenberg matrix elements of any Heisenberg dynamical variable
vary with time, and it makes $v_t$ satisfy the equation of motion (11),
as is easily verified. The variation given by (20) is simply periodic
with the frequency

$$|H'-H''|/2\pi\hbar = |H'-H''|/h, \tag{21}$$

depending only on the energy difference of the two stationary states
to which the matrix element refers. This result is closely connected
with the Combination Law of Spectroscopy and Bohr’s Frequency
Condition, according to which (21) is the frequency of the
electromagnetic radiation emitted or absorbed when the system makes a
transition under the influence of radiation between the stationary
states $\alpha'$ and $\alpha''$, the eigenvalues of H being Bohr’s energy levels.
These matters will be dealt with in § 45.
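
The remark that (20) makes $v_t$ satisfy the equation of motion (11) can be verified numerically in a small Heisenberg representation. The sketch below assumes units with ħ = 1 and uses three made-up energy levels and a made-up Hermitian matrix for v (`ENERGIES` and `V` are hypothetical illustrations, not data from the text). Since H is diagonal in this representation, the commutator $v_tH - Hv_t$ has elements $\langle\alpha'|v_t|\alpha''\rangle(H''-H')$.

```python
import cmath
import math

HBAR = 1.0                      # assumed units in which hbar = 1
ENERGIES = [0.5, 1.5, 4.0]      # hypothetical eigenvalues of H
V = [                           # hypothetical Hermitian matrix <a'|v|a''>
    [1.0, 2.0 - 1.0j, 0.5j],
    [2.0 + 1.0j, -1.0, 3.0],
    [-0.5j, 3.0, 2.0],
]


def heisenberg_elements(t, t0=0.0):
    """Matrix elements <a'|v_t|a''> according to formula (20)."""
    n = len(ENERGIES)
    return [
        [cmath.exp(1j * (ENERGIES[a] - ENERGIES[b]) * (t - t0) / HBAR) * V[a][b]
         for b in range(n)]
        for a in range(n)
    ]


def commutator_with_energy(vt):
    """Elements of v_t H - H v_t when H is diagonal with the given energies."""
    n = len(ENERGIES)
    return [[vt[a][b] * (ENERGIES[b] - ENERGIES[a]) for b in range(n)]
            for a in range(n)]


def time_derivative(t, dt=1e-5):
    """Central-difference estimate of d v_t / dt, element by element."""
    ahead, behind = heisenberg_elements(t + dt), heisenberg_elements(t - dt)
    n = len(ENERGIES)
    return [[(ahead[a][b] - behind[a][b]) / (2 * dt) for b in range(n)]
            for a in range(n)]


def bohr_frequency(energy_1, energy_2):
    """Frequency (21): |H' - H''| / (2 pi hbar)."""
    return abs(energy_1 - energy_2) / (2 * math.pi * HBAR)


def max_residual(t):
    """Largest |i hbar dv_t/dt - (v_t H - H v_t)| over all matrix elements."""
    rate = time_derivative(t)
    comm = commutator_with_energy(heisenberg_elements(t))
    n = len(ENERGIES)
    return max(abs(1j * HBAR * rate[a][b] - comm[a][b])
               for a in range(n) for b in range(n))


print(max_residual(0.7) < 1e-6)
```

The diagonal elements carry the factor $e^{0} = 1$ and so stay fixed, while each off-diagonal element returns to its starting value after a time `1 / bohr_frequency(H', H'')`.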

30. The free particle 

The most fundamental and elementary application of quantum 
mechanics is to the system consisting merely of a free particle, or 
particle not acted on by any forces. For dealing with it we use as
dynamical variables the three Cartesian coordinates x, y, z and their
conjugate momenta $p_x$, $p_y$, $p_z$. The Hamiltonian is equal to the
kinetic energy of the particle, namely

$$H = \frac{1}{2m}\,(p_x^2 + p_y^2 + p_z^2) \tag{22}$$

according to Newtonian mechanics, m being the mass. This formula
is valid only if the velocity of the particle is small compared with c,
the velocity of light. For a rapidly moving particle, such as we often
have to deal with in atomic theory, (22) must be replaced by the
relativistic formula

$$H = c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2}. \tag{23}$$

For small values of $p_x$, $p_y$, and $p_z$ (23) goes over into (22), except for
the constant term $mc^2$ which corresponds to the rest-energy of the
particle in the theory of relativity and which has no influence on the
equations of motion. Formulas (22) and (23) can be taken over
directly into the quantum theory, the square root in (23) being now
understood as the positive square root defined at the end of § 11.
The constant term $mc^2$ by which (23) differs from (22) for small values
of $p_x$, $p_y$, and $p_z$ can still have no physical effects, since the
Hamiltonian in the quantum theory, as introduced in § 27, is undefined to
the extent of an arbitrary additive real constant.

We shall here work with the more accurate formula (23). We shall 
first solve the Heisenberg equations of motion. From the quantum 
conditions (9) of § 21, $p_x$ commutes with $p_y$ and $p_z$, and hence, from
Theorem 1 of § 19 extended to a set of commuting observables, $p_x$
commutes with any function of $p_x$, $p_y$, and $p_z$ and therefore with H.
It follows that $p_x$ is a constant of the motion. Similarly $p_y$ and $p_z$ are
constants of the motion. These results are the same as in the classical
theory. Again, the equation of motion for a coordinate, $x_t$ say, is,
according to (11),

$$i\hbar\,\dot{x}_t = i\hbar\,\frac{dx_t}{dt} = x_t\, c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2} - c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2}\, x_t.$$

The right-hand side here can be evaluated by means of formula
(31) of § 22 with the roles of coordinates and momenta interchanged,
so that it reads

$$q_r f - f q_r = i\hbar\,\partial f/\partial p_r, \tag{24}$$

f now being any function of the p’s. This gives

$$\dot{x}_t = \frac{\partial}{\partial p_x}\, c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2} = \frac{c^2 p_x}{H}.$$

Similarly,

$$\dot{y}_t = \frac{c^2 p_y}{H}, \qquad \dot{z}_t = \frac{c^2 p_z}{H}. \tag{25}$$

The magnitude of the velocity is

$$v = (\dot{x}_t^2 + \dot{y}_t^2 + \dot{z}_t^2)^{1/2} = c^2\,(p_x^2 + p_y^2 + p_z^2)^{1/2}/H. \tag{26}$$

Equations (25) and (26) are just the same as in the classical theory. 
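
Since (25) and (26) coincide with the classical expressions, they can be explored with ordinary numbers in place of the operators. The following is a minimal sketch under that assumption, in units where c = 1; the helper names are hypothetical. It compares the closed form $c^2p/H$ with a finite-difference estimate of $\partial H/\partial p$, which is what (24) reduces to, and evaluates the speed (26).

```python
import math

C = 1.0  # assumed units in which the velocity of light is 1


def energy(p, mass):
    """Relativistic Hamiltonian (23) for a momentum p = (px, py, pz)."""
    return C * math.sqrt(mass * mass * C * C + sum(pi * pi for pi in p))


def velocity(p, mass):
    """Velocity components (25): c^2 p / H."""
    h = energy(p, mass)
    return tuple(C * C * pi / h for pi in p)


def velocity_by_difference(p, mass, eps=1e-6):
    """Components dH/dp estimated by central differences."""
    components = []
    for k in range(3):
        up = list(p)
        down = list(p)
        up[k] += eps
        down[k] -= eps
        components.append((energy(up, mass) - energy(down, mass)) / (2 * eps))
    return tuple(components)


def speed(p, mass):
    """Magnitude of the velocity, formula (26)."""
    return math.sqrt(sum(v * v for v in velocity(p, mass)))


momentum = (0.3, -0.4, 1.2)
print(velocity(momentum, 1.0))
print(velocity_by_difference(momentum, 1.0))
print(speed(momentum, 1.0) < C)
```

For a particle with non-zero mass the speed stays below c for every momentum, and for small momenta `energy(p, m) - m * C**2` approaches the Newtonian value (22).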

Let us consider a state that is an eigenstate of the momenta,
belonging to the eigenvalues $p'_x$, $p'_y$, $p'_z$. This state must be an
eigenstate of the Hamiltonian, belonging to the eigenvalue

$$H' = c\,(m^2c^2 + p_x'^2 + p_y'^2 + p_z'^2)^{1/2}, \tag{27}$$

and must therefore be a stationary state. The possible values for $H'$
are all numbers from $mc^2$ to ∞, as in the classical theory. The wave
function $\psi(xyz)$ representing this state at any time in Schrödinger’s
representation must satisfy

$$p'_x\,\psi(xyz)\rangle = p_x\,\psi(xyz)\rangle = -i\hbar\,\frac{\partial\psi(xyz)}{\partial x}\rangle,$$

with similar equations for $p_y$ and $p_z$. These equations show that
$\psi(xyz)$ is of the form

$$\psi(xyz) = a\,e^{i(p'_x x + p'_y y + p'_z z)/\hbar}, \tag{28}$$

where a is independent of x, y, and z. From (18) we see now that the
time-dependent wave function $\psi(xyzt)$ is of the form

$$\psi(xyzt) = a_0\,e^{i(p'_x x + p'_y y + p'_z z - H't)/\hbar}, \tag{29}$$

where $a_0$ is independent of x, y, z, and t.

The function (29) of x, y, z, and t describes plane waves in
space-time. We see from this example the suitability of the terms ‘wave
function’ and ‘wave equation’. The frequency of the waves is

$$\nu = H'/h, \tag{30}$$

their wavelength is

$$\lambda = h/(p_x'^2 + p_y'^2 + p_z'^2)^{1/2} = h/P', \tag{31}$$

$P'$ being the length of the vector $(p'_x, p'_y, p'_z)$, and their motion is in
the direction specified by the vector $(p'_x, p'_y, p'_z)$ with the velocity

$$\lambda\nu = H'/P' = c^2/v', \tag{32}$$

$v'$ being the velocity of the particle corresponding to the momentum
$(p'_x, p'_y, p'_z)$ as given by formula (26). Equations (30), (31), and (32)
are easily seen to hold in all Lorentz frames of reference, the
expression on the right-hand side of (29) being, in fact, relativistically
invariant with $p'_x$, $p'_y$, $p'_z$ and $H'$ as the components of a 4-vector.
These properties of relativistic invariance led de Broglie, before the 
discovery of quantum mechanics, to postulate the existence of waves 
of the form (29) associated with the motion of any particle. They 
are therefore known as de Broglie waves. 

In the limiting case when the mass m is made to tend to zero, the
classical velocity of the particle v becomes equal to c and hence, from
(32), the wave velocity also becomes c. The waves are then like the
light-waves associated with a photon, with the difference that they
contain no reference to the polarization and involve a complex
exponential instead of sines and cosines. Formulas (30) and (31) are
still valid, connecting the frequency of the light-waves with the 
energy of the photon and the wavelength of the light-waves with 
the momentum of the photon. 

For the state represented by (29), the probability of the particle 
being found in any specified small volume when an observation of its 
position is made is independent of where the volume is. This provides 
an example of Heisenberg’s principle of uncertainty, the state being 
one for which the momentum is accurately given and for which, in 
consequence, the position is completely unknown. Such a state is, 
of course, a limiting case which never occurs in practice. The states 
usually met with in practice are those represented by wave packets, 
which may be formed by superposing a number of waves of the type
(29) belonging to slightly different values of $(p'_x, p'_y, p'_z)$, as discussed
in § 24. The ordinary formula in hydrodynamics for the velocity of
such a wave packet, i.e. the group velocity of the waves, is

$$\frac{d\nu}{d(1/\lambda)}, \tag{33}$$

which gives, from (30) and (31),

$$\frac{dH'}{dP'} = c\,\frac{d}{dP'}\,(m^2c^2 + P'^2)^{1/2} = \frac{c^2P'}{H'} = v'. \tag{34}$$

This is just the velocity of the particle. The wave packet moves in
the same direction and with the same velocity as the particle moves
in classical mechanics.
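
The relation between the wave velocity (32) and the group velocity (34) is easy to check with numbers. The sketch below is an illustration only, in assumed units with c = 1 and h = 1, and with hypothetical function names. It differentiates the frequency (30) with respect to 1/λ numerically, as (33) prescribes, and compares the result with the particle velocity $c^2P'/H'$.

```python
import math

C = 1.0          # assumed units: velocity of light
H_PLANCK = 1.0   # assumed units: Planck's constant h


def energy(p_len, mass):
    """Eigenvalue (27) written in terms of the length P' of the momentum."""
    return C * math.sqrt(mass ** 2 * C ** 2 + p_len ** 2)


def frequency(p_len, mass):
    """Frequency (30): H'/h."""
    return energy(p_len, mass) / H_PLANCK


def wavelength(p_len):
    """Wavelength (31): h/P'."""
    return H_PLANCK / p_len


def phase_velocity(p_len, mass):
    """Wave velocity (32): lambda * nu = H'/P'."""
    return wavelength(p_len) * frequency(p_len, mass)


def group_velocity(p_len, mass, eps=1e-6):
    """Formula (33), d(nu)/d(1/lambda), by a central difference in 1/lambda."""
    k = 1.0 / wavelength(p_len)   # 1/lambda = P'/h
    nu_up = frequency(H_PLANCK * (k + eps), mass)
    nu_down = frequency(H_PLANCK * (k - eps), mass)
    return (nu_up - nu_down) / (2 * eps)


def particle_velocity(p_len, mass):
    """Velocity of the particle, c^2 P'/H', as in (26) and (34)."""
    return C * C * p_len / energy(p_len, mass)


print(phase_velocity(1.3, 1.0), group_velocity(1.3, 1.0), particle_velocity(1.3, 1.0))
```

The product of the wave velocity and the particle velocity equals c², so the individual waves travel faster than c while the packet travels with the particle.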

31. The motion of wave packets 

The result just deduced for a free particle is an example of a general 
principle. For any dynamical system with a classical analogue, a state 
for which the classical description is valid as an approximation is 
represented in quantum mechanics by a wave packet, all the
coordinates and momenta having approximate numerical values, whose
accuracy is limited by Heisenberg’s principle of uncertainty. Now
Schrödinger’s wave equation fixes how such a wave packet varies with
time, so in order that the classical description may remain valid, the
wave packet should remain a wave packet and should move according
to the laws of classical dynamics. We shall verify that this is so.

We take a dynamical system having a classical analogue and let
its Hamiltonian be $H(q_r, p_r)$ (r = 1, 2, ..., n). The corresponding
classical dynamical system will have as Hamiltonian $H_c(q_r, p_r)$ say, obtained
by putting ordinary algebraic variables for the $q_r$ and $p_r$ in $H(q_r, p_r)$
and making ħ → 0 if it occurs in $H(q_r, p_r)$. The classical Hamiltonian
$H_c$ is, of course, a real function of its variables. It is usually a
quadratic function of the momenta $p_r$, but not always so, the
relativistic theory of a free particle being an example where it is not.
The following argument is valid for $H_c$ any algebraic function of the p’s.

We suppose that the time-dependent wave function in
Schrödinger’s representation is of the form

$$\psi(qt) = A\,e^{iS/\hbar}, \tag{35}$$

where A and S are real functions of the q’s and t which do not vary
very rapidly with their arguments. The wave function is then of the
form of waves, with A and S determining the amplitude and phase
respectively. Schrödinger’s wave equation (7) gives

$$i\hbar\,\frac{\partial}{\partial t}\,A\,e^{iS/\hbar}\rangle = H(q_r, p_r)\,A\,e^{iS/\hbar}\rangle$$

or

$$\left\{ i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t} \right\}\rangle = e^{-iS/\hbar}\,H(q_r, p_r)\,A\,e^{iS/\hbar}\rangle. \tag{36}$$

Now $e^{-iS/\hbar}$ is evidently a unitary linear operator and may be used for
U in equation (70) of § 26 to give us a unitary transformation. The
q’s remain unchanged by this transformation, each $p_r$ goes over into

$$e^{-iS/\hbar}\,p_r\,e^{iS/\hbar} = p_r + \partial S/\partial q_r,$$

with the help of (31) of § 22, and H goes over into

$$e^{-iS/\hbar}\,H(q_r, p_r)\,e^{iS/\hbar} = H(q_r, p_r + \partial S/\partial q_r),$$

since algebraic relations are preserved by the transformation. Thus
(36) becomes

$$\left\{ i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t} \right\}\rangle = H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right) A\rangle. \tag{37}$$

Let us now suppose that ħ can be counted as small and let us neglect
terms involving ħ in (37). This involves neglecting the $p_r$’s that occur
in H in (37), since each $p_r$ is equivalent to the operator $-i\hbar\,\partial/\partial q_r$
operating on the functions of the q’s to the right of it. The surviving
terms give

$$-\frac{\partial S}{\partial t} = H_c\!\left(q_r, \frac{\partial S}{\partial q_r}\right). \tag{38}$$

This is a differential equation which the phase function S has to 
satisfy. The equation is determined by the classical Hamiltonian 
function H c and is known as the Hamilton-Jacobi equation in classical 
dynamics. It allows S to be real and so shows that the assumption 
of the wave form (35) does not lead to an inconsistency. 

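To obtain an equation for A, we must retain the terms in (37)
which are linear in ħ and see what they give. A direct evaluation of
these terms is rather awkward in the case of a general function H,
and we can get the result we require more easily by first multiplying
both sides of (37) by the bra vector $\langle Af$, where f is an arbitrary real
function of the q’s. This gives

$$\langle Af\left\{ i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t} \right\}\rangle = \langle Af\,H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right) A\rangle.$$

The conjugate complex equation is

$$\langle Af\left\{ -i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t} \right\}\rangle = \langle A\,H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right) fA\rangle.$$

Subtracting and dividing out by iħ, we obtain

$$2\langle Af\,\frac{\partial A}{\partial t}\rangle = \langle A\left[f, H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right)\right] A\rangle. \tag{39}$$

We now have to evaluate the P.B.

$$[f, H(q_r, p_r + \partial S/\partial q_r)].$$

Our assumption that ħ can be counted as small enables us to expand
$H(q_r, p_r + \partial S/\partial q_r)$ as a power series in the p’s. The terms of zero degree
will contribute nothing to the P.B. The terms of the first degree in
the p’s give a contribution to the P.B. which can be evaluated most
easily with the help of the classical formula (1) of § 21 (this formula
being valid also in the quantum theory if u is independent of the p’s
and v is linear in the p’s). The amount of this contribution is

$$\sum_s \frac{\partial f}{\partial q_s}\left[\frac{\partial H(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r},$$

the notation meaning that we must substitute $\partial S/\partial q_r$ for each $p_r$ in
the function [ ] of the q’s and p’s, so as to obtain a function of the q’s
only. The terms of higher degree in the p’s give contributions to the
P.B. which vanish when ħ → 0. Thus (39) becomes, with neglect of
terms involving ħ, which is equivalent to the neglect of ħ² in (37),

$$\langle f\,\frac{\partial A^2}{\partial t}\rangle = \langle A^2 \sum_s \frac{\partial f}{\partial q_s}\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r}\rangle. \tag{40}$$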

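Now if a(q) and b(q) are any two functions of the q’s, formula
(64) of § 20 gives

$$\langle a(q)\,b(q)\rangle = \int a(q')\,dq'\,b(q'),$$

and so

$$\langle a(q)\,\frac{\partial b(q)}{\partial q_r}\rangle = -\langle \frac{\partial a(q)}{\partial q_r}\,b(q)\rangle, \tag{41}$$

provided a(q) and b(q) satisfy suitable boundary conditions, as
discussed in §§ 22 and 23. Hence (40) may be written

$$\langle f\,\frac{\partial A^2}{\partial t}\rangle = -\langle f \sum_s \frac{\partial}{\partial q_s}\left\{ A^2\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r} \right\}\rangle.$$

Since this holds for an arbitrary real function f, we must have

$$\frac{\partial A^2}{\partial t} = -\sum_s \frac{\partial}{\partial q_s}\left\{ A^2\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r} \right\}. \tag{42}$$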


This is the equation for the amplitude A of the wave function. To
get an understanding of its significance, let us suppose we have a fluid
moving in the space of the variables q, the density of the fluid at any
point and time being $A^2$ and its velocity being

$$\frac{dq_s}{dt} = \left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r}. \tag{43}$$

Equation (42) is then just the equation of conservation for such a
fluid. The motion of the fluid is determined by the function S 
satisfying (38), there being one possible motion for each solution 
of (38). 
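
The fluid picture can be made concrete in the simplest case. The sketch below assumes one degree of freedom with $H_c = p^2/2m$ and the plane-wave phase $S = p_0q - p_0^2t/2m$, a solution of the Hamilton-Jacobi equation (38) chosen here only for illustration (`MASS`, `P0`, and `WIDTH` are hypothetical). The velocity (43) is then the constant $p_0/m$, and a density $A^2$ carried rigidly along at that velocity should make both sides of (42) agree, which the code tests by finite differences.

```python
import math

MASS = 1.0    # hypothetical particle mass
P0 = 0.8      # hypothetical constant momentum appearing in the phase S
WIDTH = 0.5   # hypothetical width of the amplitude envelope


def phase(q, t):
    """S = P0*q - P0^2 t / 2m, a solution of (38) for H_c = p^2/2m."""
    return P0 * q - P0 ** 2 * t / (2 * MASS)


def hamilton_jacobi_residual(q, t, eps=1e-5):
    """-dS/dt - H_c(dS/dq), which (38) requires to vanish."""
    ds_dt = (phase(q, t + eps) - phase(q, t - eps)) / (2 * eps)
    ds_dq = (phase(q + eps, t) - phase(q - eps, t)) / (2 * eps)
    return -ds_dt - ds_dq ** 2 / (2 * MASS)


def fluid_velocity(q, t):
    """Velocity (43): dH_c/dp evaluated at p = dS/dq, here simply P0/m."""
    return P0 / MASS


def density(q, t):
    """A^2: a Gaussian envelope carried along with the fluid."""
    centre = P0 * t / MASS
    return math.exp(-((q - centre) ** 2) / WIDTH ** 2)


def conservation_residual(q, t, eps=1e-5):
    """d(A^2)/dt + d(A^2 * velocity)/dq, which (42) requires to vanish."""
    d_dt = (density(q, t + eps) - density(q, t - eps)) / (2 * eps)
    flux_up = density(q + eps, t) * fluid_velocity(q + eps, t)
    flux_down = density(q - eps, t) * fluid_velocity(q - eps, t)
    return d_dt + (flux_up - flux_down) / (2 * eps)


print(abs(hamilton_jacobi_residual(0.3, 1.1)) < 1e-6)
print(abs(conservation_residual(0.3, 1.1)) < 1e-6)
```

A density that did not move with the fluid, for instance one with a fixed centre, would leave a residual of the order of the density gradient times the velocity.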

For a given S, let us take a solution of (42) for which at some
definite time the density $A^2$ vanishes everywhere outside a certain
small region. We may suppose this region to move with the fluid,
its velocity at each point being given by (43), and then the equation
of conservation (42) will require the density always to vanish outside
the region. There is a limit to how small the region may be, imposed
by the approximation we made in neglecting ħ in (39). This
approximation is valid only provided

$$\hbar\,\frac{\partial A}{\partial q_r} \ll \frac{\partial S}{\partial q_r}\,A,$$

or

$$\frac{1}{A}\frac{\partial A}{\partial q_r} \ll \frac{1}{\hbar}\frac{\partial S}{\partial q_r},$$

which requires that A shall vary by an appreciable fraction of itself
only through a range of the q’s in which S varies by many times ħ,
i.e. a range consisting of many wavelengths of the wave function (35).
Our solution is then a wave packet of the type discussed in § 24 and 
remains so for all time. 

We thus get a wave function representing a state of motion for 
which the coordinates and momenta have approximate numerical 
values throughout all time. Such a state of motion in quantum 
theory corresponds to the states with which classical theory deals. 
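The motion of our wave packet is determined by equations (38) and
(43). From these we get, defining $p_s$ as $\partial S/\partial q_s$,

$$\begin{aligned}
\frac{dp_s}{dt} = \frac{d}{dt}\frac{\partial S}{\partial q_s} &= \frac{\partial^2 S}{\partial t\,\partial q_s} + \sum_u \frac{\partial^2 S}{\partial q_u\,\partial q_s}\frac{dq_u}{dt} \\
&= -\frac{\partial}{\partial q_s} H_c\!\left(q_r, \frac{\partial S}{\partial q_r}\right) + \sum_u \frac{\partial^2 S}{\partial q_u\,\partial q_s}\frac{\partial H_c(q_r, p_r)}{\partial p_u} \\
&= -\frac{\partial H_c(q_r, p_r)}{\partial q_s},
\end{aligned} \tag{44}$$

where in the last line the p’s are counted as independent of the q’s
before the partial differentiation. Equations (43) and (44) are just
the classical equations of motion in Hamiltonian form and show that
the wave packet moves according to the laws of classical mechanics.
We see in this way how the classical equations of motion are derivable
from the quantum theory as a limiting case.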

By a more accurate solution of the wave equation one can show
that the accuracy with which the coordinates and momenta
simultaneously have numerical values cannot remain permanently as
favourable as the limit allowed by Heisenberg’s principle of
uncertainty, equation (56) of § 24, but if it is initially so it will become
less favourable, the wave packet undergoing a spreading.†

32. The action principle‡

Equation (10) shows that the Heisenberg dynamical variables at
time t, $v_t$, are connected with their values at time $t_0$, $v_{t_0}$ or v, by a
unitary transformation. The Heisenberg variables at time $t+\delta t$ are
connected with their values at time t by an infinitesimal unitary
transformation, as is shown by the equation of motion (11) or (13),
which gives the connexion between $v_{t+\delta t}$ and $v_t$ of the form of (79) or
(80) of § 26 with $H_t$ for F and $\delta t/\hbar$ for ε. The variation with time of
the Heisenberg dynamical variables may thus be looked upon as the
continuous unfolding of a unitary transformation. In classical
mechanics the dynamical variables at time $t+\delta t$ are connected with
their values at time t by an infinitesimal contact transformation and
the whole motion may be looked upon as the continuous unfolding of a
contact transformation. We have here the mathematical foundation
of the analogy between the classical and quantum equations of
motion, and can develop it to bring out the quantum analogue of all
the main features of the classical theory of dynamics.

Suppose we have a representation in which the complete set of
commuting observables ξ are diagonal, so that a basic bra is $\langle\xi'|$.
We can introduce a second representation in which the basic bras are

$$\langle\xi'^{*}| = \langle\xi'|\,T. \tag{45}$$

The new basic bras depend on the time t and give us a moving
representation, like a moving system of axes in an ordinary vector
space. Comparing (45) with the conjugate imaginary of (8), we see
that the new basic vectors are just the transforms in the Heisenberg
picture of the original basic vectors in the Schrödinger picture, and
hence they must be connected with the Heisenberg dynamical
variables $v_t$ in the same way in which the original basic vectors are
connected with the Schrödinger dynamical variables v. In particular,
each $\langle\xi'^{*}|$ must be an eigenvector of the $\xi_t$’s belonging to the
eigenvalues ξ'. It may therefore be written $\langle\xi'_t|$, with the understanding
that the numbers $\xi'_t$ are the same eigenvalues of the $\xi_t$’s that the ξ'’s
are of the ξ’s. From (45) we get

$$\langle\xi'_t|\xi''\rangle = \langle\xi'|T|\xi''\rangle, \tag{46}$$

showing that the transformation function is just the representative
of T in the original representation.

† See Kennard, Z. f. Physik, 44 (1927), 344; Darwin, Proc. Roy. Soc. A, 117 (1927),
258.

‡ This section may be omitted by the student who is not specially concerned with
higher dynamics.
Differentiating (45) with respect to t and using (6), we get

$$i\hbar\,\frac{d}{dt}\langle\xi'_t| = i\hbar\,\langle\xi'|\frac{dT}{dt} = \langle\xi'|HT = \langle\xi'_t|H_t$$

with the help of (12). Multiplying on the right by any ket $|a\rangle$
independent of t, we get

$$i\hbar\,\frac{d}{dt}\langle\xi'_t|a\rangle = \langle\xi'_t|H_t|a\rangle = \int \langle\xi'_t|H_t|\xi''_t\rangle\,d\xi''_t\,\langle\xi''_t|a\rangle, \tag{47}$$

if we take for definiteness the case of continuous eigenvalues for the
ξ’s. Now equation (5), written in terms of representatives, reads

$$i\hbar\,\frac{d}{dt}\langle\xi'|Pt\rangle = \int \langle\xi'|H|\xi''\rangle\,d\xi''\,\langle\xi''|Pt\rangle. \tag{48}$$

Since $\langle\xi'_t|H_t|\xi''_t\rangle$ is the same function of the variables $\xi'_t$ and $\xi''_t$ that
$\langle\xi'|H|\xi''\rangle$ is of ξ' and ξ'', equations (47) and (48) are of precisely the
same form, with the variables $\xi'_t$, $\xi''_t$ in (47) playing the role of the
variables ξ' and ξ'' in (48) and the function $\langle\xi'_t|a\rangle$ playing the role
of the function $\langle\xi'|Pt\rangle$. We can thus look upon (47) as a form of
Schrödinger’s wave equation, with the function $\langle\xi'_t|a\rangle$ of the variables
$\xi'_t$ as the wave function. In this way Schrödinger’s wave equation
appears in a new light, as the condition on the representative, in the
moving representation with the Heisenberg variables $\xi_t$ diagonal, of the
fixed ket corresponding to a state in the Heisenberg picture. The function
$\langle\xi'_t|a\rangle$ owes its variation with time to its left factor $\langle\xi'_t|$, in
contradistinction to the function $\langle\xi'|Pt\rangle$, which owes its variation with time
to its right factor $|Pt\rangle$.

If we put $|a\rangle = |\xi''\rangle$ in (47), we get

$$i\hbar\,\frac{d}{dt}\langle\xi'_t|\xi''\rangle = \int \langle\xi'_t|H_t|\xi'''_t\rangle\,d\xi'''_t\,\langle\xi'''_t|\xi''\rangle, \tag{49}$$

showing that the transformation function $\langle\xi'_t|\xi''\rangle$ satisfies
Schrödinger’s wave equation. Now $\xi_{t_0} = \xi$, so we must have

$$\langle\xi'_{t_0}|\xi''\rangle = \delta(\xi'_{t_0} - \xi''), \tag{50}$$

the δ function here being understood as the product of a number of
factors, one for each ξ-variable, such as occurs for the variables
$\xi_{v+1}, \ldots, \xi_u$ on the right-hand side of equation (34) of § 16. Thus the
transformation function $\langle\xi'_t|\xi''\rangle$ is that solution of Schrödinger’s wave
equation for which the ξ’s certainly have the values ξ'' at time $t_0$.
The square of its modulus, $|\langle\xi'_t|\xi''\rangle|^2$, is the relative probability of the
ξ’s having the values $\xi'_t$ at time $t > t_0$ if they certainly have the values
ξ'' at time $t_0$. We may write $\langle\xi'_t|\xi''\rangle$ as $\langle\xi'_t|\xi''_{t_0}\rangle$ and consider it as
depending on $t_0$ as well as on t. To get its dependence on $t_0$ we take
the conjugate complex of equation (49), interchange t and $t_0$ and also
interchange single primes and double primes. This gives

$$-i\hbar\,\frac{d}{dt_0}\langle\xi'_t|\xi''_{t_0}\rangle = \int \langle\xi'_t|\xi'''_{t_0}\rangle\,d\xi'''_{t_0}\,\langle\xi'''_{t_0}|H_{t_0}|\xi''_{t_0}\rangle. \tag{51}$$
The foregoing discussion of the transformation function $\langle\xi'_t|\xi''\rangle$ is
valid with the ξ’s any complete set of commuting observables. The
equations were written down for the case of the ξ’s having continuous
eigenvalues, but they would still be valid if any of the ξ’s have
discrete eigenvalues, provided the necessary formal changes are made
in them. Let us now take a dynamical system having a classical
analogue and let us take the ξ’s to be the coordinates q. Put

$$\langle q'_t|q''\rangle = e^{iS/\hbar} \tag{52}$$

and so define the function S of the variables $q'_t$, $q''$. This function also
depends explicitly on t. (52) is a solution of Schrödinger’s wave
equation and, if ħ can be counted as small, it can be handled in the
same way as (35) was. The S of (52) differs from the S of (35) on
account of there being no A in (52), which makes the S of (52)
complex, but the real part of this S equals the S of (35) and its pure
imaginary part is of the order ħ. Thus, in the limit ħ → 0, the S of
(52) will equal that of (35) and will therefore satisfy, corresponding
to (38),

$$-\partial S/\partial t = H_c(q'_{rt}, p'_{rt}), \tag{53}$$

where

$$p'_{rt} = \partial S/\partial q'_{rt}, \tag{54}$$

and $H_c$ is the Hamiltonian of the classical analogue of our quantum
dynamical system. But (52) is also a solution of (51) with q’s for ξ’s,
which is the conjugate complex of Schrödinger’s wave equation in the
variables $q''$ or $q''_{t_0}$. This causes S to satisfy also†

$$\partial S/\partial t_0 = H_c(q''_r, p''_r), \tag{55}$$

where

$$p''_r = -\partial S/\partial q''_r. \tag{56}$$

The solution of the Hamilton-Jacobi equations (53), (55) is the
action function of classical mechanics for the time interval $t_0$ to t,
i.e. it is the time integral of the Lagrangian L,

$$S = \int_{t_0}^{t} L(t')\,dt'. \tag{57}$$

Thus the S defined by (52) is the quantum analogue of the classical action
function and equals it in the limit ħ → 0. To get the quantum analogue
of the classical Lagrangian, we pass to the case of an infinitesimal
time interval by putting $t = t_0 + \delta t$ and we then have $\langle q'_{t_0+\delta t}|q''_{t_0}\rangle$ as the
analogue of $e^{iL(t_0)\,\delta t/\hbar}$. For the sake of the analogy, one should consider
$L(t_0)$ as a function of the coordinates $q'$ at time $t_0 + \delta t$ and the
coordinates $q''$ at time $t_0$, rather than as a function of the coordinates
and velocities at time $t_0$, as one usually does.
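
The pair of Hamilton-Jacobi equations (53), (55) can be illustrated with the one case where the classical action is elementary. The sketch below assumes a free non-relativistic particle in one dimension, for which the action between $(q'', t_0)$ and $(q', t)$ is $m(q'-q'')^2/2(t-t_0)$; this example and the names in the code are illustrative assumptions, not taken from the text. The momenta are formed from (54) and (56) by central differences and inserted into $H_c = p^2/2m$.

```python
MASS = 1.0  # hypothetical mass of a free non-relativistic particle


def action(q, t, q0, t0):
    """Classical action (57) for a free particle going from (q0, t0) to (q, t)."""
    return MASS * (q - q0) ** 2 / (2 * (t - t0))


def classical_hamiltonian(p):
    """H_c for the free particle."""
    return p * p / (2 * MASS)


def derivative(f, x, eps=1e-6):
    """Central-difference derivative of a function of one variable."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def check_hamilton_jacobi(q, t, q0, t0):
    """Residuals of (53) and (55), with the momenta defined by (54) and (56)."""
    p_final = derivative(lambda x: action(x, t, q0, t0), q)        # (54)
    p_initial = -derivative(lambda x: action(q, t, x, t0), q0)     # (56)
    ds_dt = derivative(lambda x: action(q, x, q0, t0), t)
    ds_dt0 = derivative(lambda x: action(q, t, q0, x), t0)
    residual_53 = -ds_dt - classical_hamiltonian(p_final)
    residual_55 = ds_dt0 - classical_hamiltonian(p_initial)
    return residual_53, residual_55, p_final, p_initial


print(check_hamilton_jacobi(2.0, 3.0, 0.5, 1.0))
```

Both residuals vanish to within the finite-difference error, and the two momenta agree, as they must for a particle on which no force acts.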

The principle of least action in classical mechanics says that the
action function (57) remains stationary for small variations of the
trajectory of the system which do not alter the end points, i.e. for small
variations of the $q$'s at all intermediate times between $t_0$ and $t$ with
$q_{t_0}$ and $q_t$ fixed. Let us see what it corresponds to in the quantum theory.

Put
$$ \exp\Big\{ i\int_{t_a}^{t_b} L(t)\,dt/\hbar \Big\} = \exp\{iS(t_b,t_a)/\hbar\} = B(t_b,t_a), \tag{58} $$

so that $B(t_b,t_a)$ corresponds to $\langle q'_{t_b}|q'_{t_a}\rangle$ in the quantum theory. (We
here allow $q'_{t_a}$ and $q'_{t_b}$ to denote different eigenvalues of $q_{t_a}$ and $q_{t_b}$, to
save having to introduce a large number of primes into the analysis.)
Now suppose the time interval $t_0 \to t$ to be divided up into a large
number of small time intervals $t_0 \to t_1$, $t_1 \to t_2$, ..., $t_{m-1} \to t_m$,
$t_m \to t$, by the introduction of a sequence of intermediate times
$t_1, t_2, \ldots, t_m$. Then

$$ B(t,t_0) = B(t,t_m)B(t_m,t_{m-1})\cdots B(t_2,t_1)B(t_1,t_0). \tag{59} $$

The corresponding quantum equation, which follows from the property
of basic vectors (35) of § 16, is

$$ \langle q'_t|q'_0\rangle = \int\!\!\int\!\cdots\!\int \langle q'_t|q'_m\rangle\,dq'_m\,\langle q'_m|q'_{m-1}\rangle\,dq'_{m-1}\cdots\langle q'_2|q'_1\rangle\,dq'_1\,\langle q'_1|q'_0\rangle, \tag{60} $$

$q'_k$ being written for $q'_{t_k}$ for brevity. At first sight there does not seem
to be any close correspondence between (59) and (60). We must,
however, analyse the meaning of (59) rather more carefully. We must
regard each factor $B$ as a function of the $q$'s at the two ends of the
time interval to which it refers. This makes the right-hand side of
(59) a function, not only of $q_t$ and $q_{t_0}$, but also of all the intermediate
$q$'s. Equation (59) is valid only when we substitute for the intermediate
$q$'s in its right-hand side their values for the real trajectory,
small variations in which values leave $S$ stationary and therefore also,
from (58), leave $B(t,t_0)$ stationary. It is the process of substituting
these values for the intermediate $q$'s which corresponds to the
integrations over all values for the intermediate $q'$'s in (60). The quantum
analogue of the action principle is thus absorbed in the composition
law (60) and the classical requirement that the values of the
intermediate $q$'s shall make $S$ stationary corresponds to the condition
in quantum mechanics that all values of the intermediate $q'$'s
are important in proportion to their contribution to the integral
in (60).

† For a more accurate comparison of transformation functions with classical
theory, see Van Vleck, Proc. Nat. Acad. 14, 178.
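The classical side of this statement can be checked in a minimal sketch. It assumes a free particle of unit mass, for which the action between $(q_a, t_a)$ and $(q_b, t_b)$ is $m(q_b - q_a)^2/2(t_b - t_a)$; the choice of system, the numerical end points and the function names are assumptions of the illustration. The actions of the two sub-intervals add up to the action of the whole interval, so that the corresponding factors $B$ multiply as in (59), only when the intermediate $q$ lies on the real trajectory, and that value also makes the sum stationary.

```python
def action(qa, ta, qb, tb, m=1.0):
    """Classical action of a free particle going from (qa, ta) to (qb, tb)."""
    return m * (qb - qa) ** 2 / (2.0 * (tb - ta))


def split_action(q1, q0=0.0, t0=0.0, t1=1.0, q=3.0, t=3.0):
    """Sum of the actions of the two sub-intervals t0 -> t1 and t1 -> t."""
    return action(q0, t0, q1, t1) + action(q1, t1, q, t)


# Intermediate coordinate on the real (straight-line) trajectory at time t1 = 1.
q1_real = 0.0 + (3.0 - 0.0) * (1.0 - 0.0) / (3.0 - 0.0)

total = action(0.0, 0.0, 3.0, 3.0)          # action for the whole interval
on_path = split_action(q1_real)             # equals total
off_path = split_action(q1_real + 0.1)      # differs from total
# Central difference of the split action with respect to q1 at the real value.
slope = (split_action(q1_real + 1e-6) - split_action(q1_real - 1e-6)) / 2e-6
```

Since $B = e^{iS/\hbar}$, the equality of `on_path` and `total` is the statement (59) for a single intermediate time, and the vanishing of `slope` is the stationary property of $S$.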

Let us see how (59) can be a limiting case of (60) for $\hbar$ small. We
must suppose the integrand in (60) to be of the form $e^{iF/\hbar}$, where $F$ is
a function of $q'_0, q'_1, q'_2, \ldots, q'_m, q'_t$ which remains continuous as
$\hbar$ tends to zero, so that the integrand is a rapidly oscillating function when
$\hbar$ is small. The integral of such a rapidly oscillating function will be
extremely small, except for the contribution arising from a region in
the domain of integration where comparatively large variations in
the $q'_k$ produce only very small variations in $F$. Such a region must
be the neighbourhood of a point where $F$ is stationary for small
variations of the $q'_k$. Thus the integral in (60) is determined essentially by
the value of the integrand at a point where the integrand is stationary
for small variations of the intermediate $q'$'s, and so (60) goes over
into (59).
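The dominance of the stationary point can be seen numerically in a minimal sketch, assuming a single integration variable and the illustrative choice $F(q) = (q-1)^2/2$, which is stationary at $q = 1$. The function name, the value of $\hbar$ and the integration ranges are assumptions of the example. Two ranges of equal width are compared, one containing the stationary point and one not.

```python
import cmath
import math


def oscillatory_integral(a, b, hbar, steps=200000):
    """Midpoint-rule estimate of the integral of exp(iF/hbar) from a to b,
    with the illustrative choice F(q) = (q - 1)**2 / 2 (stationary at q = 1)."""
    dq = (b - a) / steps
    total = 0j
    for j in range(steps):
        q = a + (j + 0.5) * dq
        total += cmath.exp(1j * (q - 1.0) ** 2 / (2.0 * hbar))
    return total * dq


hbar = 0.01
near = oscillatory_integral(0.0, 2.0, hbar)   # range containing the stationary point
far = oscillatory_integral(3.0, 5.0, hbar)    # same width, no stationary point
leading = math.sqrt(2.0 * math.pi * hbar)     # modulus of the integral over all q
```

The modulus of `near` is close to `leading`, while `far` is smaller by more than an order of magnitude, and the disparity grows as `hbar` is reduced.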

Equations (54) and (56) express that the variables $q'_t, p'_t$ are
connected with the variables $q'', p''$ by a contact transformation and are
one of the standard forms of writing the equations of a contact
transformation. There is an analogous form for writing the equations of a
unitary transformation in quantum mechanics. We get from (52), with
the help of (45) of § 22,

$$ \langle q'_t|p_{rt}|q''\rangle = -i\hbar\frac{\partial}{\partial q'_{rt}}\langle q'_t|q''\rangle = \frac{\partial S(q'_t,q'')}{\partial q'_{rt}}\langle q'_t|q''\rangle. \tag{61} $$

Similarly, with the help of (46) of § 22,

$$ \langle q'_t|p_r|q''\rangle = i\hbar\frac{\partial}{\partial q''_r}\langle q'_t|q''\rangle = -\frac{\partial S(q'_t,q'')}{\partial q''_r}\langle q'_t|q''\rangle. \tag{62} $$

From the general definition of functions of commuting observables,
we have

$$ \langle q'_t|f(q_t)g(q)|q''\rangle = f(q'_t)g(q'')\langle q'_t|q''\rangle, \tag{63} $$

where $f(q_t)$ and $g(q)$ are functions of the $q_t$'s and $q$'s respectively. Let
$G(q_t,q)$ be any function of the $q_t$'s and $q$'s consisting of a sum or
integral of terms each of the form $f(q_t)g(q)$, so that all the $q_t$'s in $G$
occur to the left of all the $q$'s. Such a function we call well ordered.
Applying (63) to each of the terms in $G$ and adding or integrating,
we get

$$ \langle q'_t|G(q_t,q)|q''\rangle = G(q'_t,q'')\langle q'_t|q''\rangle. $$

Now let us suppose each $p_{rt}$ and $p_r$ can be expressed as a well-ordered
function of the $q_t$'s and $q$'s and write these functions $p_{rt}(q_t,q)$, $p_r(q_t,q)$.
Putting these functions for $G$, we get

$$ \langle q'_t|p_{rt}|q''\rangle = p_{rt}(q'_t,q'')\langle q'_t|q''\rangle, $$
$$ \langle q'_t|p_r|q''\rangle = p_r(q'_t,q'')\langle q'_t|q''\rangle. $$

Comparing these equations with (61) and (62) respectively, we see
that

$$ p_{rt}(q'_t,q'') = \frac{\partial S(q'_t,q'')}{\partial q'_{rt}}, \qquad p_r(q'_t,q'') = -\frac{\partial S(q'_t,q'')}{\partial q''_r}. $$

This means that

$$ p_{rt} = \frac{\partial S(q_t,q)}{\partial q_{rt}}, \qquad p_r = -\frac{\partial S(q_t,q)}{\partial q_r}, \tag{64} $$

provided the right-hand sides of (64) are written as well-ordered
functions.

These equations are of the same form as (54) and (56), but refer to
the non-commuting quantum variables $q_t$, $q$ instead of the ordinary
algebraic variables $q'_t$, $q''$. They show how the conditions for a unitary
transformation between quantum variables are analogous to the
conditions for a contact transformation between classical variables. The
analogy is not complete, however, because the classical $S$ must be real
and there is no simple condition corresponding to this for the $S$ of (64).


33. The Gibbs ensemble 

In our work up to the present we have been assuming all along that
our dynamical system at each instant of time is in a definite state,
that is to say, its motion is specified as completely and accurately as
is possible without conflicting with the general principles of the theory.
In the classical theory this would mean, of course, that all the
coordinates and momenta have specified values. Now we may be interested
in a motion which is specified to a lesser extent than this maximum
possible. The present section will be devoted to the methods to be
used in such a case.

The procedure in classical mechanics is to introduce what is called
a Gibbs ensemble, the idea of which is as follows. We consider all the
dynamical coordinates and momenta as Cartesian coordinates in a
certain space, the phase space, whose number of dimensions is twice
the number of degrees of freedom of the system. Any state of the
system can then be represented by a point in this space. This point
will move according to the classical equations of motion (14).
Suppose, now, that we are not given that the system is in a definite state
at any time, but only that it is in one or other of a number of possible
states according to a definite probability law. We should then be
able to represent it by a fluid in the phase space, the mass of fluid in
any volume of the phase space being the total probability of the
system being in any state whose representative point lies in that
volume. Each particle of the fluid will be moving according to the
equations of motion (14). If we introduce the density $\rho$ of the fluid
at any point, equal to the probability per unit volume of phase space
of the system being in the neighbourhood of the corresponding state,
we shall have the equation of conservation

$$ \frac{\partial\rho}{\partial t} = -\sum_r \Big\{ \frac{\partial}{\partial q_r}\Big(\rho\frac{dq_r}{dt}\Big) + \frac{\partial}{\partial p_r}\Big(\rho\frac{dp_r}{dt}\Big) \Big\} = -\sum_r \Big\{ \frac{\partial}{\partial q_r}\Big(\rho\frac{\partial H}{\partial p_r}\Big) - \frac{\partial}{\partial p_r}\Big(\rho\frac{\partial H}{\partial q_r}\Big) \Big\} = -[\rho, H]. \tag{65} $$

This may be considered as the equation of motion for the fluid, since
it determines the density $\rho$ for all time, if $\rho$ is given initially as a
function of the $q$'s and $p$'s. It is, apart from the minus sign, of the
same form as the ordinary equation of motion (15) for a dynamical
variable.

The requirement that the total probability of the system being in
any state shall be unity gives us a normalizing condition for $\rho$

$$ \int\!\!\int \rho\,dq\,dp = 1, \tag{66} $$

the integration being over the whole of phase space and the single
differential $dq$ or $dp$ being written to denote the product of all the
$dq$'s or $dp$'s. If $\beta$ denotes any function of the dynamical variables,
the average value of $\beta$ will be

$$ \int\!\!\int \beta\rho\,dq\,dp. \tag{67} $$

It makes only a trivial alteration in the theory, but often facilitates
discussion, if we work with a density $\rho$ differing from the above one
by a positive constant factor, $k$ say, so that we have instead of (66)

$$ \int\!\!\int \rho\,dq\,dp = k. $$

With this density we can picture the fluid as representing a number
$k$ of similar dynamical systems, all following through their motions
independently in the same place, without any mutual disturbance or
interaction. The density at any point would then be the probable or
average number of systems in the neighbourhood of any state per unit
volume of phase space, and expression (67) would give the average
total value of $\beta$ for all the systems. Such a set of dynamical systems,
which is the ensemble introduced by Gibbs, is usually not realizable
in practice, except as a rough approximation, but it forms all the
same a useful theoretical abstraction.

We shall now see that there exists a corresponding density $\rho$
in quantum mechanics, having properties analogous to the above.
It was first introduced by von Neumann. Its existence is rather
surprising in view of the fact that phase space has no meaning in
quantum mechanics, there being no possibility of assigning numerical
values simultaneously to the $q$'s and $p$'s.

We consider a dynamical system which is at a certain time in one
or other of a number of possible states according to some given
probability law. These states may be either a discrete set or a
continuous range, or both together. We shall here take for definiteness
the case of a discrete set and suppose them labelled by a parameter $m$.
Let the normalized ket vectors corresponding to them be $|m\rangle$ and let
the probability of the system being in the $m$th state be $P_m$. We then
define the quantum density $\rho$ by

$$ \rho = \sum_m |m\rangle P_m \langle m|. \tag{68} $$

Let $\rho'$ be any eigenvalue of $\rho$ and $|\rho'\rangle$ an eigenket belonging to this
eigenvalue. Then

$$ \sum_m |m\rangle P_m \langle m|\rho'\rangle = \rho|\rho'\rangle = \rho'|\rho'\rangle, $$

so that

$$ \sum_m \langle\rho'|m\rangle P_m \langle m|\rho'\rangle = \rho'\langle\rho'|\rho'\rangle, $$

or

$$ \sum_m P_m |\langle m|\rho'\rangle|^2 = \rho'\langle\rho'|\rho'\rangle. $$

Now $P_m$, being a probability, can never be negative. It follows that
$\rho'$ cannot be negative. Thus $\rho$ has no negative eigenvalues, in analogy
with the fact that the classical density $\rho$ is never negative.

Let us now obtain the equation of motion for our quantum $\rho$. In
Schrödinger's picture the kets and bras in (68) will vary with the time
in accordance with Schrödinger's equation (5) and the conjugate
imaginary of this equation, while the $P_m$'s will remain constant, since
the system, so long as it is left undisturbed, cannot change over from
a state corresponding to one ket satisfying Schrödinger's equation to
a state corresponding to another. We thus have

$$ i\hbar\frac{d\rho}{dt} = \sum_m \{ H|m\rangle P_m\langle m| - |m\rangle P_m\langle m|H \} = H\rho - \rho H. \tag{69} $$

This is the quantum analogue of the classical equation of motion
(65). Our quantum $\rho$, like the classical one, is determined for all time
if it is given initially.
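Equation (69) can be checked numerically in a minimal sketch, assuming a system with only two stationary states and working in the representation in which $H$ is diagonal. The energies, the initial matrix and all names below are illustrative assumptions. Each ket $|m\rangle$ evolving by Schrödinger's equation makes the element of $\rho$ in row $a$ and column $b$ acquire the factor $e^{-i(E_a - E_b)t/\hbar}$, and the time derivative is estimated by a central difference.

```python
import cmath

HBAR = 1.0
ENERGIES = [0.3, 1.1]          # eigenvalues of a diagonal Hamiltonian (illustrative)
RHO0 = [[0.6, 0.2 + 0.1j],     # initial density matrix in the energy representation
        [0.2 - 0.1j, 0.4]]


def rho_at(t):
    """Density matrix at time t when every ket evolves by Schrodinger's equation."""
    return [[RHO0[a][b] * cmath.exp(-1j * (ENERGIES[a] - ENERGIES[b]) * t / HBAR)
             for b in range(2)] for a in range(2)]


def commutator_with_h(rho):
    """Matrix elements of H*rho - rho*H for the diagonal Hamiltonian."""
    return [[(ENERGIES[a] - ENERGIES[b]) * rho[a][b] for b in range(2)]
            for a in range(2)]


def lhs_minus_rhs(t, dt=1e-5):
    """Largest element of i*hbar*d(rho)/dt - (H*rho - rho*H), by central difference."""
    plus, minus, now = rho_at(t + dt), rho_at(t - dt), rho_at(t)
    comm = commutator_with_h(now)
    return max(abs(1j * HBAR * (plus[a][b] - minus[a][b]) / (2 * dt) - comm[a][b])
               for a in range(2) for b in range(2))


residual = lhs_minus_rhs(0.7)   # vanishes up to the finite-difference error
```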

From the assumption of § 12, the average value of any observable
$\beta$ when the system is in the state $m$ is $\langle m|\beta|m\rangle$. Hence if the system
is distributed over the various states $m$ according to the probability
law $P_m$, the average value of $\beta$ will be $\sum_m P_m\langle m|\beta|m\rangle$. If we introduce
a representation with a discrete set of basic ket vectors $|\xi'\rangle$ say, this
equals

$$ \sum_{m\xi'} P_m\langle m|\xi'\rangle\langle\xi'|\beta|m\rangle = \sum_{\xi' m} \langle\xi'|\beta|m\rangle P_m\langle m|\xi'\rangle = \sum_{\xi'} \langle\xi'|\beta\rho|\xi'\rangle = \sum_{\xi'} \langle\xi'|\rho\beta|\xi'\rangle, \tag{70} $$

the last step being easily verified with the law of matrix
multiplication, equation (44) of § 17. The expressions (70) are the analogue of
the expression (67) of the classical theory. Whereas in the classical
theory we have to multiply $\beta$ by $\rho$ and take the integral of the
product over all phase space, in the quantum theory we have to
multiply $\beta$ by $\rho$, with the factors in either order, and take the
diagonal sum of the product in a representation. If the
representation involves a continuous range of basic vectors $|\xi'\rangle$, we get instead
of (70)

$$ \int \langle\xi'|\beta\rho|\xi'\rangle\,d\xi' = \int \langle\xi'|\rho\beta|\xi'\rangle\,d\xi', \tag{71} $$

so that we must carry through a process of 'integrating along the
diagonal' instead of summing the diagonal elements. We shall define
(71) to be the diagonal sum of $\beta\rho$ in the continuous case. It can easily
be verified, from the properties of transformation functions (56) of
§ 18, that the diagonal sum is the same for all representations.

From the condition that the $|m\rangle$'s are normalized we get, with
discrete $\xi'$'s,

$$ \sum_{\xi'} \langle\xi'|\rho|\xi'\rangle = \sum_{\xi' m} \langle\xi'|m\rangle P_m\langle m|\xi'\rangle = \sum_m P_m = 1, \tag{72} $$

since the total probability of the system being in any state is unity.
This is the analogue of equation (66). The probability of the system
being in the state $\xi'$, or the probability of the observables $\xi$ which
are diagonal in the representation having the values $\xi'$, is, according
to the rule for interpreting representatives of kets (51) of § 18,

$$ \sum_m |\langle\xi'|m\rangle|^2 P_m = \langle\xi'|\rho|\xi'\rangle, \tag{73} $$

which gives us a meaning for each term in the sum on the left-hand
side of (72). For continuous $\xi'$'s, the right-hand side of (73) gives the
probability of the $\xi$'s having values in the neighbourhood of $\xi'$ per
unit range of variation of the values $\xi'$.
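The relations (70), (72) and (73) can be illustrated in a minimal sketch, assuming a system with only two basic kets and two normalized (but not orthogonal) states $|m\rangle$ with real representatives. The kets, the probabilities, the observable and the helper names are all assumptions chosen for the illustration.

```python
import math

# Two normalized, non-orthogonal kets |m> with probabilities P_m (illustrative values).
KETS = [[1.0, 0.0], [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]]
PROBS = [0.25, 0.75]
BETA = [[1.0, 0.0], [0.0, -1.0]]      # an observable, here diagonal in the representation


def density():
    """rho = sum_m |m> P_m <m| as a 2x2 matrix (real kets, so no conjugation needed)."""
    return [[sum(p * k[a] * k[b] for k, p in zip(KETS, PROBS)) for b in range(2)]
            for a in range(2)]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[a][c] * y[c][b] for c in range(2)) for b in range(2)]
            for a in range(2)]


def diagonal_sum(x):
    """Sum of the diagonal elements of a 2x2 matrix."""
    return x[0][0] + x[1][1]


rho = density()
# Average of beta computed directly as sum_m P_m <m|beta|m>.
direct = sum(p * sum(k[a] * BETA[a][b] * k[b] for a in range(2) for b in range(2))
             for k, p in zip(KETS, PROBS))
```

Here `diagonal_sum(rho)` reproduces (72), the diagonal sums of `matmul(rho, BETA)` and `matmul(BETA, rho)` both reproduce `direct` as in (70), and the diagonal elements of `rho` are the probabilities (73).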

As in the classical theory, we may take a density equal to $k$ times
the above $\rho$ and consider it as representing a Gibbs ensemble of $k$
similar dynamical systems, between which there is no mutual
disturbance or interaction. We shall then have $k$ on the right-hand side
of (72), and (70) or (71) will give the total average $\beta$ for all the
members of the ensemble, while (73) will give the total probability
of a member of the ensemble having values for its $\xi$'s equal to $\xi'$
or in the neighbourhood of $\xi'$ per unit range of variation of the
values $\xi'$.

An important application of the Gibbs ensemble is to a dynamical
system in thermodynamic equilibrium with its surroundings at a
given temperature $T$. Gibbs showed that such a system is
represented in classical mechanics by the density

$$ \rho = c\,e^{-H/kT}, \tag{74} $$

$H$ being the Hamiltonian, which is now independent of the time, $k$
being Boltzmann's constant, and $c$ being a number chosen to make
the normalizing condition (66) hold. This formula may be taken over
unchanged into the quantum theory. At high temperatures, (74)
becomes $\rho = c$, which gives, on being substituted into the right-hand
side of (73), $c\langle\xi'|\xi'\rangle = c$ in the case of discrete $\xi'$'s. This shows that
at high temperatures all discrete states are equally probable.
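The high-temperature limit can be illustrated in a minimal sketch, assuming a system with a finite set of discrete energy levels so that the normalizing sum is finite. The particular levels, the temperatures and the function name are assumptions of the example. In the representation with $H$ diagonal, the diagonal elements of (74) are $c\,e^{-E/kT}$.

```python
import math


def boltzmann_probabilities(energies, kT):
    """Diagonal elements c*exp(-E/kT) of rho in the energy representation,
    with c fixed by the normalizing condition (sum of probabilities = 1)."""
    weights = [math.exp(-e / kT) for e in energies]
    c = 1.0 / sum(weights)
    return [c * w for w in weights]


levels = [0.5, 1.5, 2.5, 3.5]            # a finite set of levels, arbitrary units
cold = boltzmann_probabilities(levels, kT=0.2)
hot = boltzmann_probabilities(levels, kT=1.0e4)
```

At the low temperature nearly all the probability sits in the lowest level, while at the high temperature the four probabilities agree to better than one part in a thousand.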



VI

34. The harmonic oscillator

A simple and interesting example of a dynamical system in quantum
mechanics is the harmonic oscillator. This example is of importance
for general theory, because it forms a corner-stone in the theory of
radiation. The dynamical variables needed for describing the system
are just one coordinate $q$ and its conjugate momentum $p$. The
Hamiltonian in classical mechanics is

$$ H = \frac{1}{2m}(p^2 + m^2\omega^2 q^2), \tag{1} $$

where $m$ is the mass of the oscillating particle and $\omega$ is $2\pi$ times the
frequency. We assume the same Hamiltonian in quantum mechanics.
This Hamiltonian, together with the quantum condition (10) of § 22,
define the system completely.

The Heisenberg equations of motion are

$$ \dot q_t = [q_t, H] = p_t/m, \qquad \dot p_t = [p_t, H] = -m\omega^2 q_t. \tag{2} $$

It is convenient to introduce the dimensionless complex dynamical
variable

$$ \eta = (2m\hbar\omega)^{-1/2}(p + im\omega q). \tag{3} $$

The equations of motion (2) give

$$ \dot\eta_t = (2m\hbar\omega)^{-1/2}(-m\omega^2 q_t + i\omega p_t) = i\omega\eta_t. $$

This equation can be integrated to give

$$ \eta_t = \eta_0 e^{i\omega t}, \tag{4} $$

where $\eta_0$ is a linear operator independent of $t$, and is equal to the
value of $\eta_t$ at time $t = 0$. The above equations are all as in the
classical theory.

We can express $q$ and $p$ in terms of $\eta$ and its conjugate complex $\bar\eta$
and may thus work entirely in terms of $\eta$ and $\bar\eta$. We have

$$ \hbar\omega\,\eta\bar\eta = (2m)^{-1}(p + im\omega q)(p - im\omega q) = (2m)^{-1}[p^2 + m^2\omega^2 q^2 + im\omega(qp - pq)] = H - \tfrac{1}{2}\hbar\omega \tag{5} $$

and similarly

$$ \hbar\omega\,\bar\eta\eta = H + \tfrac{1}{2}\hbar\omega. \tag{6} $$

Thus

$$ \bar\eta\eta - \eta\bar\eta = 1. \tag{7} $$

Equation (5) or (6) gives $H$ in terms of $\eta$ and $\bar\eta$ and (7) gives the
commutation relation connecting $\eta$ and $\bar\eta$. From (5)

$$ \hbar\omega\,\bar\eta\eta\bar\eta = \bar\eta H - \tfrac{1}{2}\hbar\omega\bar\eta $$

and from (6)

$$ \hbar\omega\,\bar\eta\eta\bar\eta = H\bar\eta + \tfrac{1}{2}\hbar\omega\bar\eta. $$

Thus

$$ \bar\eta H - H\bar\eta = \hbar\omega\bar\eta. \tag{8} $$

Also, (7) leads to

$$ \bar\eta\eta^n - \eta^n\bar\eta = n\eta^{n-1} \tag{9} $$

for any positive integer $n$, as may be verified by induction, since, by
multiplying (9) by $\eta$ on the left, we can deduce (9) with $n+1$ for $n$.
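The relations (7) and (9) can be checked in a minimal sketch that assumes one concrete realization of the operators: $\eta$ acting on polynomials in a variable $z$ as multiplication by $z$, and $\bar\eta$ as differentiation with respect to $z$. This realization satisfies (7), but it is an assumption of the example and is not used in the argument of the text; the function names are likewise hypothetical. Polynomials are stored as coefficient lists in increasing powers of $z$.

```python
def eta(poly):
    """Multiply a polynomial (coefficients in increasing powers of z) by z."""
    return [0] + list(poly)


def eta_bar(poly):
    """Differentiate a polynomial with respect to z."""
    return [j * c for j, c in enumerate(poly)][1:] or [0]


def power(op, n, poly):
    """Apply the operator op to poly n times."""
    for _ in range(n):
        poly = op(poly)
    return poly


def pad(poly, length):
    """Extend a coefficient list with zeros up to the given length."""
    return list(poly) + [0] * (length - len(poly))


def commutator_9(n, poly):
    """Coefficients of (eta_bar eta^n - eta^n eta_bar) applied to poly."""
    left = eta_bar(power(eta, n, poly))
    right = power(eta, n, eta_bar(poly))
    size = max(len(left), len(right))
    return [a - b for a, b in zip(pad(left, size), pad(right, size))]


def rhs_9(n, poly):
    """Coefficients of n * eta^(n-1) applied to poly."""
    return [n * c for c in power(eta, n - 1, poly)]


f = [2, -1, 0, 5]     # the polynomial 2 - z + 5 z^3, chosen arbitrarily
```

With $n = 1$ the commutator returns `f` itself, which is (7), and for larger $n$ it agrees with `rhs_9(n, f)`, which is (9).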

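Let $H'$ be an eigenvalue of $H$ and $|H'\rangle$ an eigenket belonging to it.
From (5)

$$ \hbar\omega\langle H'|\eta\bar\eta|H'\rangle = \langle H'|H - \tfrac{1}{2}\hbar\omega|H'\rangle = (H' - \tfrac{1}{2}\hbar\omega)\langle H'|H'\rangle. $$

Now $\langle H'|\eta\bar\eta|H'\rangle$ is the square of the length of the ket $\bar\eta|H'\rangle$, and
hence

$$ \langle H'|\eta\bar\eta|H'\rangle \ge 0, $$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. Also $\langle H'|H'\rangle > 0$.
Thus

$$ H' \ge \tfrac{1}{2}\hbar\omega, \tag{10} $$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. From the form (1)
of $H$ as a sum of squares, we should expect its eigenvalues to be all
positive or zero (since the average value of $H$ for any state must be
positive or zero). We now have the more stringent condition (10).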

From (8)

$$ H\bar\eta|H'\rangle = (\bar\eta H - \hbar\omega\bar\eta)|H'\rangle = (H' - \hbar\omega)\bar\eta|H'\rangle. \tag{11} $$

Now if $H' \ne \tfrac{1}{2}\hbar\omega$, $\bar\eta|H'\rangle$ is not zero and is then according to (11) an
eigenket of $H$ belonging to the eigenvalue $H' - \hbar\omega$. Thus, with $H'$
any eigenvalue of $H$ not equal to $\tfrac{1}{2}\hbar\omega$, $H' - \hbar\omega$ is another eigenvalue
of $H$. We can repeat the argument and infer that, if $H' - \hbar\omega \ne \tfrac{1}{2}\hbar\omega$,
$H' - 2\hbar\omega$ is another eigenvalue of $H$. Continuing in this way, we
obtain the series of eigenvalues $H'$, $H' - \hbar\omega$, $H' - 2\hbar\omega$, $H' - 3\hbar\omega$, ...,
which cannot extend to infinity, because then it would contain
eigenvalues contradicting (10), and can terminate only with the value $\tfrac{1}{2}\hbar\omega$.
Again, from the conjugate complex of equation (8)

$$ H\eta|H'\rangle = (\eta H + \hbar\omega\eta)|H'\rangle = (H' + \hbar\omega)\eta|H'\rangle, $$

showing that $H' + \hbar\omega$ is another eigenvalue of $H$, with $\eta|H'\rangle$ as an
eigenket belonging to it, unless $\eta|H'\rangle = 0$. The latter alternative
can be ruled out, since it would lead to

$$ 0 = \hbar\omega\,\bar\eta\eta|H'\rangle = (H + \tfrac{1}{2}\hbar\omega)|H'\rangle = (H' + \tfrac{1}{2}\hbar\omega)|H'\rangle, $$

which contradicts (10). Thus $H' + \hbar\omega$ is always another eigenvalue
of $H$, and so are $H' + 2\hbar\omega$, $H' + 3\hbar\omega$ and so on. Hence the eigenvalues
of $H$ are the series of numbers

$$ \tfrac{1}{2}\hbar\omega, \quad \tfrac{3}{2}\hbar\omega, \quad \tfrac{5}{2}\hbar\omega, \quad \tfrac{7}{2}\hbar\omega, \quad \ldots \tag{12} $$

extending to infinity. These are the possible energy values for the
harmonic oscillator.

Let $|0\rangle$ be an eigenket of $H$ belonging to the lowest eigenvalue
$\tfrac{1}{2}\hbar\omega$, so that

$$ \bar\eta|0\rangle = 0, \tag{13} $$

and form the sequence of kets

$$ |0\rangle, \quad \eta|0\rangle, \quad \eta^2|0\rangle, \quad \eta^3|0\rangle, \quad \ldots \tag{14} $$

These kets are all eigenkets of $H$, belonging to the sequence of
eigenvalues (12) respectively. From (9) and (13)

$$ \bar\eta\eta^n|0\rangle = n\eta^{n-1}|0\rangle \tag{15} $$

for any non-negative integer $n$. Thus the set of kets (14) is such that
$\eta$ or $\bar\eta$ applied to any one of the set gives a ket dependent on the set.
Now all the dynamical variables in our problem are expressible in terms
of $\eta$ and $\bar\eta$, so the kets (14) must form a complete set (otherwise there
would be some more dynamical variables). There is just one of these
kets for each eigenvalue (12) of $H$, so $H$ by itself forms a complete
commuting set of observables. The kets (14) correspond to the various
stationary states of the oscillator. The stationary state with energy
$(n + \tfrac{1}{2})\hbar\omega$, corresponding to $\eta^n|0\rangle$, is called the $n$-th quantum state.
The square of the length of the ket $\eta^n|0\rangle$ is

$$ \langle 0|\bar\eta^n\eta^n|0\rangle = n\langle 0|\bar\eta^{n-1}\eta^{n-1}|0\rangle $$

with the help of (15). By induction, we find that

$$ \langle 0|\bar\eta^n\eta^n|0\rangle = n! \tag{16} $$

provided $|0\rangle$ is normalized. Thus the kets (14) multiplied by the
coefficients $n!^{-1/2}$ with $n = 0, 1, 2, \ldots$, respectively form the basic kets
of a representation, namely the representation with $H$ diagonal. Any
ket $|x\rangle$ can be expanded in the form

$$ |x\rangle = \sum_{n=0}^{\infty} x_n\eta^n|0\rangle, \tag{17} $$

where the $x_n$'s are numbers. In this way the ket $|x\rangle$ is put into
correspondence with a power series $\sum x_n\eta^n$ in the variable $\eta$, the
various terms in the power series corresponding to the various
stationary states. If $|x\rangle$ is normalized, it defines a state for which
the probability of the oscillator being in the $n$th quantum state,
i.e. the probability of $H$ having the value $(n + \tfrac{1}{2})\hbar\omega$, is

$$ P_n = n!\,|x_n|^2, \tag{18} $$

as follows from the same argument which led to (51) of § 18.
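A minimal sketch of the use of (16) and (18) follows. It assumes a ket with only three non-vanishing coefficients $x_n$, chosen arbitrarily and not normalized beforehand; the coefficient values and function names are assumptions of the example. Kets $\eta^n|0\rangle$ with different $n$ are orthogonal, since they belong to different eigenvalues of $H$, so the squared length of $|x\rangle$ is $\sum n!\,|x_n|^2$.

```python
import math


def norm_squared(x):
    """<x|x> for |x> = sum_n x_n eta^n |0>, using <0|etabar^n eta^n|0> = n!."""
    return sum(math.factorial(n) * abs(c) ** 2 for n, c in enumerate(x))


def level_probabilities(x):
    """P_n = n! |x_n|^2 after rescaling the coefficients so that <x|x> = 1."""
    total = norm_squared(x)
    return [math.factorial(n) * abs(c) ** 2 / total for n, c in enumerate(x)]


coefficients = [1.0, 1.0j, 0.5]      # x_0, x_1, x_2 of an unnormalized ket (illustrative)
probabilities = level_probabilities(coefficients)
# Average energy in units of hbar*omega, from the eigenvalues (n + 1/2).
mean_energy = sum(p * (n + 0.5) for n, p in enumerate(probabilities))
```

For these coefficients the squared length is 2.5, the probabilities are 0.4, 0.4 and 0.2, and the average energy is 1.3 in units of $\hbar\omega$.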

We may consider the ket $|0\rangle$ as a standard ket and the power series
in $\eta$ as a wave function, since any ket can be expressed as such a
wave function multiplied into this standard ket. The present kind
of wave function differs from the usual kind, introduced by equations
(62) of § 20, in that it is a function of the complex dynamical variable
$\eta$ instead of observables. It is, however, for many purposes the most
convenient wave function to use for describing states of the harmonic
oscillator. The standard ket $|0\rangle$ satisfies the condition (13), which
replaces the conditions (43) of § 22 for the standard ket in
Schrödinger's representation.

Let us introduce Schrödinger's representation with $q$ diagonal and
obtain the representatives of the stationary states. From (13) and (3)

$$ (p - im\omega q)|0\rangle = 0, $$

so

$$ \langle q'|p - im\omega q|0\rangle = 0. $$

With the help of (45) of § 22, this gives

$$ \hbar\frac{\partial}{\partial q'}\langle q'|0\rangle + m\omega q'\langle q'|0\rangle = 0. \tag{19} $$

The solution of this differential equation is

$$ \langle q'|0\rangle = (m\omega/\pi\hbar)^{1/4}\,e^{-m\omega q'^2/2\hbar}, \tag{20} $$

the numerical coefficient being chosen so as to make $|0\rangle$ normalized.
We have here the representative of the normal state, as the state of
lowest energy is called. The representatives of the other stationary
states can be obtained from it. We have from (3)

$$ \langle q'|\eta^n|0\rangle = (2m\hbar\omega)^{-n/2}\langle q'|(p + im\omega q)^n|0\rangle $$
$$ = (2m\hbar\omega)^{-n/2}\,i^n\Big(-\hbar\frac{\partial}{\partial q'} + m\omega q'\Big)^n\langle q'|0\rangle $$
$$ = i^n(2m\hbar\omega)^{-n/2}(m\omega/\pi\hbar)^{1/4}\Big(-\hbar\frac{\partial}{\partial q'} + m\omega q'\Big)^n e^{-m\omega q'^2/2\hbar}. \tag{21} $$

This may easily be worked out for small values of $n$. The result is of
the form of $e^{-m\omega q'^2/2\hbar}$ times a power series of degree $n$ in $q'$. A further
factor $n!^{-1/2}$ must be inserted in (21) to get the normalized
representative of the $n$th quantum state. The factor $i^n$ may be discarded, being
merely a phase factor.
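The working out of (21) for small $n$ can be sketched as follows, assuming units in which $\hbar = m = \omega = 1$ and dropping the numerical coefficients, so that only the polynomial multiplying $e^{-q'^2/2}$ is tracked. The function names and the coefficient-list representation are assumptions of the example. Applying $(-d/dq' + q')$ to $P(q')e^{-q'^2/2}$ gives $(2q'P - P')e^{-q'^2/2}$, which is the recurrence generating the Hermite polynomials.

```python
def raise_once(poly):
    """Apply (-d/dq + q) to poly(q)*exp(-q^2/2) and return the new polynomial factor.

    With hbar = m = omega = 1 the result is (2*q*poly - poly') * exp(-q^2/2).
    Polynomials are coefficient lists in increasing powers of q.
    """
    derivative = [j * c for j, c in enumerate(poly)][1:]
    result = [0] + [2 * c for c in poly]          # 2*q*poly
    for j, c in enumerate(derivative):
        result[j] -= c                            # subtract poly'
    return result


def polynomial_factor(n):
    """Polynomial multiplying exp(-q^2/2) after n applications, starting from 1."""
    poly = [1]
    for _ in range(n):
        poly = raise_once(poly)
    return poly
```

For example, `polynomial_factor(2)` gives the coefficients of $4q'^2 - 2$ and `polynomial_factor(3)` those of $8q'^3 - 12q'$, each of degree $n$ as stated above.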
35. Angular momentum

Let us consider a particle described by the three Cartesian
coordinates $x$, $y$, $z$ and their conjugate momenta $p_x$, $p_y$, $p_z$. Its angular
momentum about the origin is defined as in the classical theory, by

$$ m_x = yp_z - zp_y, \qquad m_y = zp_x - xp_z, \qquad m_z = xp_y - yp_x, \tag{22} $$

or by the vector equation

$$ \mathbf{m} = \mathbf{x} \times \mathbf{p}. $$

We must evaluate the P.B.s of the angular momentum components
with the dynamical variables $x$, $p_x$, etc., and with each other. This
we can do most conveniently with the help of the laws (4) and (5) of
§ 21, thus

$$ [m_z, x] = [xp_y - yp_x, x] = -y[p_x, x] = y, \qquad [m_z, y] = [xp_y - yp_x, y] = x[p_y, y] = -x, \tag{23} $$
$$ [m_z, z] = [xp_y - yp_x, z] = 0, \tag{24} $$

and similarly,

$$ [m_z, p_x] = p_y, \qquad [m_z, p_y] = -p_x, \tag{25} $$
$$ [m_z, p_z] = 0, \tag{26} $$

with corresponding relations for $m_x$ and $m_y$. Again

$$ [m_y, m_z] = [zp_x - xp_z, m_z] = z[p_x, m_z] - [x, m_z]p_z = -zp_y + yp_z = m_x, $$
$$ [m_z, m_x] = m_y, \qquad [m_x, m_y] = m_z. \tag{27} $$

These results are all the same as in the classical theory. The sign in
the results (23), (25), and (27) may easily be remembered from the
rule that the $+$ sign occurs when the three dynamical variables,
consisting of the two in the P.B. on the left-hand side and the one
forming the result on the right, are in the cyclic order $(xyz)$ and the
$-$ sign occurs otherwise. Equations (27) may be put in the vector
form

$$ \mathbf{m} \times \mathbf{m} = i\hbar\,\mathbf{m}. \tag{28} $$


Now suppose we have several particles with angular momenta
$\mathbf{m}_1, \mathbf{m}_2, \ldots$. Each of these angular momentum vectors will satisfy
(28), thus

$$ \mathbf{m}_r \times \mathbf{m}_r = i\hbar\,\mathbf{m}_r, $$

and any one of them will commute with any other, so that

$$ \mathbf{m}_r \times \mathbf{m}_s + \mathbf{m}_s \times \mathbf{m}_r = 0 \qquad (r \ne s). $$

Hence if $\mathbf{M} = \sum_r \mathbf{m}_r$ is the total angular momentum,

$$ \mathbf{M} \times \mathbf{M} = \sum_{rs} \mathbf{m}_r \times \mathbf{m}_s = \sum_r \mathbf{m}_r \times \mathbf{m}_r + \sum_{r<s} (\mathbf{m}_r \times \mathbf{m}_s + \mathbf{m}_s \times \mathbf{m}_r) = i\hbar\sum_r \mathbf{m}_r = i\hbar\,\mathbf{M}. \tag{29} $$

This result is of the same form as (28), so that the components of the
total angular momentum $\mathbf{M}$ of any number of particles satisfy the
same commutation relations as those of the angular momentum of
a single particle.

Let $A_x$, $A_y$, $A_z$ denote the three coordinates of any one of the
particles, or else the three components of momentum of one of
the particles. The $A$'s will commute with the angular momenta of
the other particles, and hence from (23), (24), (25), and (26)

$$ [M_z, A_x] = A_y, \qquad [M_z, A_y] = -A_x, \qquad [M_z, A_z] = 0. \tag{30} $$

If $B_x$, $B_y$, $B_z$ are a second set of three quantities denoting the
coordinates or momentum components of one of the particles, they
will satisfy similar relations to (30). We shall then have

$$ [M_z, A_xB_x + A_yB_y + A_zB_z] = [M_z, A_x]B_x + A_x[M_z, B_x] + [M_z, A_y]B_y + A_y[M_z, B_y] $$
$$ = A_yB_x + A_xB_y - A_xB_y - A_yB_x = 0. $$

Thus the scalar product $A_xB_x + A_yB_y + A_zB_z$ commutes with $M_z$,
and similarly with $M_x$ and $M_y$. Introduce the vector product

$$ \mathbf{A} \times \mathbf{B} = \mathbf{C} $$

or

$$ A_yB_z - A_zB_y = C_x, \qquad A_zB_x - A_xB_z = C_y, \qquad A_xB_y - A_yB_x = C_z. $$

We have

$$ [M_z, C_x] = -A_xB_z + A_zB_x = C_y $$

and similarly

$$ [M_z, C_y] = -C_x, \qquad [M_z, C_z] = 0. $$

These equations are again of the form (30), with $\mathbf{C}$ for $\mathbf{A}$. We can
conclude from this work that equations of the form (30) hold for the
three components of any vector that we can construct from our
dynamical variables, and that any scalar commutes with $\mathbf{M}$.

We can introduce linear operators $R$ referring to rotations about
the origin in the same way in which we introduced the linear operators
$D$ in § 25 referring to displacements. Taking a rotation through an
angle $\delta\phi$ about the $z$-axis and making $\delta\phi$ infinitesimal, we can obtain
the limit operator corresponding to (64) of § 25,

$$ \lim_{\delta\phi \to 0} (R - 1)/\delta\phi, $$

which we shall call the rotation operator about the $z$-axis and denote
by $r_z$. Like the displacement operators, $r_z$ is a pure imaginary linear
operator and is undetermined to the extent of an arbitrary additive
pure imaginary number. Corresponding to (66) of § 25, the change
in any dynamical variable $v$ caused by a rotation through a small
angle $\delta\phi$ about the $z$-axis is

$$ \delta\phi\,(r_z v - v r_z), \tag{31} $$

to the first order in $\delta\phi$. Now the changes produced in the three
components $A_x$, $A_y$, $A_z$ of a vector by a (right-handed) rotation $\delta\phi$
about the $z$-axis applied to all measuring apparatus are $\delta\phi\,A_y$,
$-\delta\phi\,A_x$, and 0 respectively, and any scalar quantity is unchanged by
the rotation. Equating these changes to (31), we find that

$$ r_zA_x - A_xr_z = A_y, \qquad r_zA_y - A_yr_z = -A_x, \qquad r_zA_z - A_zr_z = 0, $$

and $r_z$ commutes with any scalar. Comparing these results with (30),
we see that $i\hbar r_z$ satisfies the same commutation relations as $M_z$.
Their difference, $M_z - i\hbar r_z$, commutes with all the dynamical variables
and must therefore be a number. This number, which is necessarily
real since $M_z$ and $i\hbar r_z$ are real, may be made zero by a suitable choice
of the arbitrary pure imaginary number that can be added to $r_z$. We
then have the result

$$ M_z = i\hbar r_z. \tag{32} $$

Similar equations hold for $M_x$ and $M_y$. They are the analogues of (69)
of § 25. Thus the total angular momentum is connected with the
rotation operators as the total momentum is connected with the displacement
operators. This conclusion is valid for any point as origin.
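The first-order changes $\delta\phi\,A_y$, $-\delta\phi\,A_x$, 0 used above can be checked in a minimal sketch, assuming that a rotation applied to all measuring apparatus means that the vector is kept fixed while the axes to which its components are referred are turned right-handedly about the $z$-axis. The numerical vector, the angle and the function name are assumptions of the example.

```python
import math


def components_after_rotation(a, dphi):
    """Components of the fixed vector a = (ax, ay, az) referred to axes that have
    been turned right-handedly through the angle dphi about the z-axis."""
    ax, ay, az = a
    c, s = math.cos(dphi), math.sin(dphi)
    return (c * ax + s * ay, -s * ax + c * ay, az)


a = (1.0, 2.0, 3.0)
dphi = 1.0e-6
rotated = components_after_rotation(a, dphi)
changes = tuple(new - old for new, old in zip(rotated, a))
expected = (dphi * a[1], -dphi * a[0], 0.0)   # dphi*A_y, -dphi*A_x, 0
```

The entries of `changes` and `expected` differ only by terms of the second order in `dphi`.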

The above argument applies to the angular momentum arising
from the motion of particles, defined by (22) for each particle. There
is another kind of angular momentum occurring in atomic theory,
spin angular momentum. The former kind of angular momentum will
be called orbital angular momentum, to distinguish it. The spin
angular momentum of a particle should be pictured as due to some internal
motion of the particle, so that it is associated with different degrees
of freedom from those describing the motion of the particle as a whole,
and hence the dynamical variables that describe the spin must
commute with $x$, $y$, $z$, $p_x$, $p_y$, and $p_z$. The spin does not correspond very
closely to anything in classical mechanics, so the method of classical
analogy is not suitable for studying it. However, we can build up a
theory of the spin simply from the assumption that the components
of the spin angular momentum are connected with the rotation
operators in the same way as we had above for orbital angular momentum,
i.e. equation (32) holds with $M_z$ as the $z$ component of the spin angular
momentum of a particle and $r_z$ as the rotation operator about the
$z$-axis referring to states of spin of that particle. With this
assumption, the commutation relations connecting the components of the
spin angular momentum $\mathbf{M}$ with any vector $\mathbf{A}$ referring to the spin
must be of the standard form (30), and hence, taking $\mathbf{A}$ to be the
spin angular momentum itself, we have equation (29) holding also
for the spin. We now have (29) holding quite generally, for any sum
of spin and orbital angular momenta, and also (30) will hold generally,
for $\mathbf{M}$ the total spin and orbital angular momentum and $\mathbf{A}$ any vector
dynamical variable, and the connexion between angular momentum
and rotation operators will be always valid.

As an immediate consequence of this connexion, we can deduce the 
law of conservation of angular momentum. For an isolated system, the 
Hamiltonian must be unchanged by any rotation about the origin, in 
other words it must be a scalar, so it must commute with the angular 
momentum about the origin. Thus the angular momentum is a 
constant of the motion. For this argument the origin may be any 
point. 

As a second immediate consequence, we can deduce that a state
with zero total angular momentum is spherically symmetrical. The state
will correspond to a ket $|S\rangle$, say, satisfying

$$ M_x|S\rangle = M_y|S\rangle = M_z|S\rangle = 0, $$

and hence

$$ r_x|S\rangle = r_y|S\rangle = r_z|S\rangle = 0. $$

This shows that the ket $|S\rangle$ is unaltered by infinitesimal rotations,
and it must therefore be unaltered by finite rotations, since the latter
can be built up from infinitesimal ones. Thus the state is spherically
symmetrical. The converse theorem, a spherically symmetrical state
has zero total angular momentum, is also true, though its proof is not
quite so simple. A spherically symmetrical state corresponds to a ket
$|S\rangle$ whose direction is unaltered by any rotation. Thus the change
in $|S\rangle$ produced by a rotation operator $r_x$, $r_y$, or $r_z$ must be a numerical
multiple of $|S\rangle$, say

$$ r_x|S\rangle = c_x|S\rangle, \qquad r_y|S\rangle = c_y|S\rangle, \qquad r_z|S\rangle = c_z|S\rangle, $$

where the $c$'s are numbers. This gives

$$ M_x|S\rangle = i\hbar c_x|S\rangle, \qquad M_y|S\rangle = i\hbar c_y|S\rangle, \qquad M_z|S\rangle = i\hbar c_z|S\rangle. \tag{33} $$

These equations are not consistent with the commutation relations
(29) for $M_x$, $M_y$, $M_z$ unless $c_x = c_y = c_z = 0$, in which case the state
has zero total angular momentum. We have in (33) an example of
a ket which is simultaneously an eigenket of the three non-commuting
linear operators $M_x$, $M_y$, $M_z$, and this is possible only if all three
eigenvalues are zero.

36. Properties of angular momentum 

There are some general properties of angular momentum, deducible
simply from the commutation relations between the three
components. These properties must hold equally for spin and orbital angular
momentum. Let $m_x$, $m_y$, $m_z$ be the three components of an angular
momentum, and introduce the quantity $\beta$ defined by

$$ \beta = m_x^2 + m_y^2 + m_z^2. $$

Since $\beta$ is a scalar it must commute with $m_x$, $m_y$, and $m_z$. Let us
suppose we have a dynamical system for which $m_x$, $m_y$, $m_z$ are the
only dynamical variables. Then $\beta$ commutes with everything and
must be a number. We can study this dynamical system on much
the same lines as we used for the harmonic oscillator in § 34.

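Put

$$ m_x - im_y = \eta. $$

From the commutation relations (27) we get

$$ \bar\eta\eta = (m_x + im_y)(m_x - im_y) = m_x^2 + m_y^2 - i(m_xm_y - m_ym_x) = \beta - m_z^2 + \hbar m_z \tag{34} $$

and similarly

$$ \eta\bar\eta = \beta - m_z^2 - \hbar m_z. \tag{35} $$

Thus

$$ \bar\eta\eta - \eta\bar\eta = 2\hbar m_z. \tag{36} $$

Also

$$ m_z\eta - \eta m_z = i\hbar m_y - \hbar m_x = -\hbar\eta. \tag{37} $$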


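We assume that the components of an angular momentum are
observables and thus $m_z$ has eigenvalues. Let $m'_z$ be one of them,
and $|m'_z\rangle$ an eigenket belonging to it. From (34)

$$ \langle m'_z|\bar\eta\eta|m'_z\rangle = \langle m'_z|\beta - m_z^2 + \hbar m_z|m'_z\rangle = (\beta - m_z'^2 + \hbar m'_z)\langle m'_z|m'_z\rangle. $$

The left-hand side here is the square of the length of the ket $\eta|m'_z\rangle$
and is thus greater than or equal to zero, the case of equality
occurring if and only if $\eta|m'_z\rangle = 0$. Hence

$$ \beta - m_z'^2 + \hbar m'_z \ge 0, $$

or

$$ \beta + \tfrac{1}{4}\hbar^2 \ge (m'_z - \tfrac{1}{2}\hbar)^2. \tag{38} $$

Thus

$$ \beta + \tfrac{1}{4}\hbar^2 \ge 0. $$

Defining the number $k$ by

$$ k + \tfrac{1}{2}\hbar = (\beta + \tfrac{1}{4}\hbar^2)^{1/2} = (m_x^2 + m_y^2 + m_z^2 + \tfrac{1}{4}\hbar^2)^{1/2}, \tag{39} $$

so that $k \ge -\tfrac{1}{2}\hbar$, the inequality (38) becomes

$$ k + \tfrac{1}{2}\hbar \ge |m'_z - \tfrac{1}{2}\hbar| $$

or

$$ k + \hbar \ge m'_z \ge -k. \tag{40} $$

An equality occurs if and only if $\eta|m'_z\rangle = 0$. Similarly from (35)

$$ \langle m'_z|\eta\bar\eta|m'_z\rangle = (\beta - m_z'^2 - \hbar m'_z)\langle m'_z|m'_z\rangle, $$

showing that

$$ \beta - m_z'^2 - \hbar m'_z \ge 0 $$

or

$$ k \ge m'_z \ge -k - \hbar, $$

with an equality occurring if and only if $\bar\eta|m'_z\rangle = 0$. This result
combined with (40) shows that $k \ge 0$ and

$$ k \ge m'_z \ge -k, \tag{41} $$

with $m'_z = k$ if $\bar\eta|m'_z\rangle = 0$ and $m'_z = -k$ if $\eta|m'_z\rangle = 0$.

From (37)

$$ m_z\eta|m'_z\rangle = (\eta m_z - \hbar\eta)|m'_z\rangle = (m'_z - \hbar)\eta|m'_z\rangle. $$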


Now if m' P —k, q\m! z ) is not zero and is then an eigenket of m z 
belonging to the eigenvalue m z —b. Similarly, if m’ z —b # —k, m z —2b 
is another eigenvalue of m z , and so on. We get in this way a senes 

of eigenvalues m' z , m’ z -b, m' z -2b .which must terminate from (41), 

and can terminate only with the value -k. Again, from the conjugate 
complex of equation (37) 

m z ij\m, z y = ( rjm z +brj) \m' z y = (m z +b)ij\m z y, 


showing that m' z +b is another eigenvalue of m z unless i)K> - 0, m 
which case m z = k. Continuing in this way we get a senes of eigen¬ 
values m' z ,m’ z +b,m' z +2b,..., which must terminate from (41), and 
can terminate only with the value k. We can conclude that 2k is an 
integral multiple of b and that the eigenvalues of m z are 

k, k-b, k-2b, -k+h, -k. (42) 


The eigenvalues of $m_x$ and $m_y$ are the same, from symmetry. These
eigenvalues are all integral or half odd integral multiples of $\hbar$,
according to whether $2k$ is an even or odd multiple of $\hbar$.
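
The sequence (42) is easy to tabulate. The following is a minimal sketch, assuming units in which $\hbar = 1$ and taking the integer $2k/\hbar$ as input; the function name `mz_eigenvalues` is introduced here for illustration only.

```python
from fractions import Fraction


def mz_eigenvalues(two_k_over_hbar):
    """Eigenvalues k, k-1, ..., -k of m_z in units of hbar.

    The argument is the non-negative integer 2k/hbar, so that an even
    value gives integral eigenvalues and an odd value half odd integral ones.
    """
    if two_k_over_hbar < 0:
        raise ValueError("2k/hbar must be a non-negative integer")
    k = Fraction(two_k_over_hbar, 2)
    # Step down from k in unit steps; there are 2k/hbar + 1 values in all.
    return [k - step for step in range(two_k_over_hbar + 1)]


if __name__ == "__main__":
    for two_k in range(5):
        print(two_k, [str(value) for value in mz_eigenvalues(two_k)])
```

Exact fractions are used so that the half odd integral values are not subject to rounding.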

Let $|\mathrm{max}\rangle$ be an eigenket of $m_z$ belonging to the maximum
eigenvalue $k$, so that

$$\bar\eta|\mathrm{max}\rangle = 0, \tag{43}$$

and form the sequence of kets

$$|\mathrm{max}\rangle,\ \eta|\mathrm{max}\rangle,\ \eta^2|\mathrm{max}\rangle,\ \ldots,\ \eta^{2k/\hbar}|\mathrm{max}\rangle. \tag{44}$$

These kets are all eigenkets of $m_z$, belonging to the sequence of
eigenvalues (42) respectively. The set of kets (44) is such that the operator
$\eta$ applied to any one of them gives a ket dependent on the set ($\eta$
applied to the last gives zero), and from (36) and (43) one sees
that $\bar\eta$ applied to any one of the set also gives a ket dependent on the
set. All the dynamical variables for the system we are now dealing
with are expressible in terms of $\eta$ and $\bar\eta$, so the set of kets (44) is a
complete set. There is just one of these kets for each eigenvalue (42)
of $m_z$, so $m_z$ by itself forms a complete commuting set of observables.
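
The statement that $\eta$ and $\bar\eta$ act within the set (44) can be illustrated with explicit matrices. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, the kets (44) normalized, and their phases chosen so that the matrix elements of $\eta$ are real and positive. The squared lengths are taken from (34) with $\beta = k(k+\hbar)$, which follows from (39). All function names are hypothetical.

```python
import math


def ladder_matrices(two_k):
    """Matrices of m_z, eta and eta-bar (hbar = 1) in the normalized basis (44).

    two_k is the integer 2k/hbar. Basis index 0 is |max>; index j has m_z = k - j.
    """
    k = two_k / 2.0
    dim = two_k + 1
    m_values = [k - j for j in range(dim)]
    mz = [[m_values[i] if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    eta = [[0.0] * dim for _ in range(dim)]
    for j in range(dim - 1):
        m = m_values[j]
        # From (34): squared length of eta|m> is beta - m^2 + m with beta = k(k+1).
        eta[j + 1][j] = math.sqrt(k * (k + 1) - m * m + m)
    # eta is real here, so its conjugate complex is just the transpose.
    eta_bar = [[eta[j][i] for j in range(dim)] for i in range(dim)]
    return mz, eta, eta_bar


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)] for row_ab, row_ba in zip(ab, ba)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def max_abs_difference(a, b):
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def check_relations(two_k):
    """Deviations from (36) and (37) for the matrices built above."""
    mz, eta, eta_bar = ladder_matrices(two_k)
    dev36 = max_abs_difference(commutator(eta_bar, eta), scale(2.0, mz))
    dev37 = max_abs_difference(commutator(mz, eta), scale(-1.0, eta))
    return dev36, dev37


if __name__ == "__main__":
    for two_k in range(6):
        print(two_k, check_relations(two_k))
```

Both deviations should come out at the level of rounding error, which shows that these $(2k/\hbar + 1)$-dimensional matrices satisfy relations (36) and (37).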

It is convenient to define the magnitude of the angular momentum
vector $\mathbf{m}$ to be $k$, given by (39), rather than $\beta^{1/2}$, because the possible
values for $k$ are

$$0,\ \tfrac12\hbar,\ \hbar,\ \tfrac32\hbar,\ 2\hbar,\ \ldots, \tag{45}$$

extending to infinity, while the possible values for $\beta^{1/2}$ are a more
complicated set of numbers.

For a dynamical system involving other dynamical variables besides
$m_x$, $m_y$, and $m_z$, there may be variables that do not commute with $\beta$.
Then $\beta$ is no longer a number, but a general linear operator. This
happens for any orbital angular momentum (22), as $x$, $y$, $z$, $p_x$, $p_y$, and
$p_z$ do not commute with $\beta$. We shall assume that $\beta$ is always an
observable, and $k$ can then be defined by (39) with the positive square
root function and is also an observable. We shall call $k$ so defined
the magnitude of the angular momentum vector $\mathbf{m}$ in the general
case. The above analysis by which we obtained the eigenvalues of
$m_z$ is still valid if we replace $|m_z'\rangle$ by a simultaneous eigenket $|k'm_z'\rangle$
of the commuting observables $k$ and $m_z$, and leads to the result that
the possible eigenvalues for $k$ are the numbers (45), and for each
eigenvalue $k'$ of $k$ the eigenvalues of $m_z$ are the numbers (42) with $k'$
substituted for $k$. We have here an example of a phenomenon which
we have not met with previously, namely that with two commuting
observables, the eigenvalues of one depend on what eigenvalue we
assign to the other. This phenomenon may be understood as the two
observables being not altogether independent, but partially functions
of one another. The number of independent simultaneous eigenkets
of $k$ and $m_z$ belonging to the eigenvalues $k'$ and $m_z'$ must be independent
of $m_z'$, since for each independent $|k'm_z'\rangle$ we can obtain an
independent $|k'm_z''\rangle$, for any $m_z''$ in the sequence (42), by multiplying
$|k'm_z'\rangle$ by a suitable power of $\eta$ or $\bar\eta$.

As an example let us consider a dynamical system with two angular
momenta $\mathbf{m}_1$ and $\mathbf{m}_2$, which commute with one another. If there are
no other dynamical variables, then all the dynamical variables commute
with the magnitudes $k_1$ and $k_2$ of $\mathbf{m}_1$ and $\mathbf{m}_2$, so $k_1$ and $k_2$ are
numbers. However, the magnitude $K$ of the resultant angular
momentum $\mathbf{M} = \mathbf{m}_1 + \mathbf{m}_2$ is not a number (it does not commute
with the components of $\mathbf{m}_1$ and $\mathbf{m}_2$) and it is interesting to work out
the eigenvalues of $K$. This can be done most simply by a method
of counting independent kets. There is one independent simultaneous
eigenket of $m_{1z}$ and $m_{2z}$ belonging to any eigenvalue $m_{1z}'$ having one of
the values $k_1$, $k_1 - \hbar$, $k_1 - 2\hbar$, ..., $-k_1$ and any eigenvalue $m_{2z}'$ having one
of the values $k_2$, $k_2 - \hbar$, $k_2 - 2\hbar$, ..., $-k_2$, and this ket is an eigenket
of $M_z$ belonging to the eigenvalue $M_z' = m_{1z}' + m_{2z}'$. The possible
values of $M_z'$ are thus $k_1 + k_2$, $k_1 + k_2 - \hbar$, $k_1 + k_2 - 2\hbar$, ..., $-k_1 - k_2$, and
the number of times each of them occurs is given by the following
scheme (if we assume for definiteness that $k_1 \geq k_2$; in the second row
$k_2$ stands for $k_2/\hbar$),

$$\begin{array}{ccccccccccc}
k_1+k_2, & k_1+k_2-\hbar, & k_1+k_2-2\hbar, & \ldots, & k_1-k_2, & k_1-k_2-\hbar, & \ldots, & -k_1+k_2, & -k_1+k_2-\hbar, & \ldots, & -k_1-k_2 \\
1 & 2 & 3 & \ldots & 2k_2+1 & 2k_2+1 & \ldots & 2k_2+1 & 2k_2 & \ldots & 1
\end{array} \tag{46}$$

Now each eigenvalue $K'$ of $K$ will be associated with the eigenvalues
$K'$, $K' - \hbar$, $K' - 2\hbar$, ..., $-K'$ for $M_z$, with the same number of independent
simultaneous eigenkets of $K$ and $M_z$ for each of them. The total
number of independent eigenkets of $M_z$ belonging to any eigenvalue
$M_z'$ must be the same, whether we take them to be simultaneous
eigenkets of $m_{1z}$ and $m_{2z}$ or simultaneous eigenkets of $K$ and $M_z$, i.e.
it is always given by the scheme (46). It follows that the eigenvalues
for $K$ are

$$k_1+k_2,\ k_1+k_2-\hbar,\ k_1+k_2-2\hbar,\ \ldots,\ k_1-k_2, \tag{47}$$

and that for each of these eigenvalues for $K$ and an eigenvalue for
$M_z$ going with it there is just one independent simultaneous eigenket
of $K$ and $M_z$.
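
The method of counting independent kets lends itself to a short computation. The sketch below is an illustration only, assuming units with $\hbar = 1$; the names `ladder` and `resultant_magnitudes` are introduced here. It first builds the multiplicities of scheme (46), then repeatedly removes one complete ladder $K', K'-1, \ldots, -K'$ starting from the largest remaining value of $M_z'$.

```python
from collections import Counter
from fractions import Fraction


def ladder(k):
    """Values k, k-1, ..., -k (units of hbar) for a magnitude k."""
    k = Fraction(k)
    return [k - step for step in range(int(2 * k) + 1)]


def resultant_magnitudes(k1, k2):
    """Eigenvalues of K for two commuting angular momenta, by counting kets."""
    # Multiplicity of each M_z' = m_1z' + m_2z', i.e. the scheme (46).
    counts = Counter(m1 + m2 for m1 in ladder(k1) for m2 in ladder(k2))
    magnitudes = []
    while any(counts.values()):
        # The largest M_z' still present must be the top of a ladder for some K'.
        top = max(m for m, c in counts.items() if c > 0)
        magnitudes.append(top)
        for m in ladder(top):
            counts[m] -= 1
    return magnitudes


if __name__ == "__main__":
    print([str(K) for K in resultant_magnitudes(1, Fraction(1, 2))])
    print([str(K) for K in resultant_magnitudes(2, 1)])
```

The values returned run from $k_1 + k_2$ down to $|k_1 - k_2|$ in unit steps, each occurring once, in agreement with (47).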

The effect of rotations on eigenkets of angular momentum variables
should be noted. Take any eigenket $|M_z'\rangle$ of the $z$ component of total
angular momentum for any dynamical system, and apply to it a small
rotation through an angle $\delta\phi$ about the $z$-axis. It will change into

$$(1 + \delta\phi\, r_z)|M_z'\rangle = (1 - i\,\delta\phi\, M_z/\hbar)|M_z'\rangle$$

with the help of (32). This equals

$$(1 - i\,\delta\phi\, M_z'/\hbar)|M_z'\rangle = e^{-i\,\delta\phi\, M_z'/\hbar}|M_z'\rangle$$

to the first order in $\delta\phi$. Thus $|M_z'\rangle$ gets multiplied by the numerical
factor $e^{-i\,\delta\phi\, M_z'/\hbar}$. By applying a succession of these small rotations, we
find that the application of a finite rotation through an angle $\phi$ about
the $z$-axis causes $|M_z'\rangle$ to get multiplied by $e^{-i\phi M_z'/\hbar}$. Putting $\phi = 2\pi$,
we find that an application of one revolution about the $z$-axis leaves
$|M_z'\rangle$ unchanged if the eigenvalue $M_z'$ is an integral multiple of $\hbar$ and
causes $|M_z'\rangle$ to change sign if $M_z'$ is half an odd integral multiple of $\hbar$.
Now consider an eigenket $|K'\rangle$ of the magnitude $K$ of the total angular
momentum. If the eigenvalue $K'$ is an integral multiple of $\hbar$, the
possible eigenvalues of $M_z$ are all integral multiples of $\hbar$ and the
application of one revolution about the $z$-axis must leave $|K'\rangle$ unchanged.
Conversely, if $K'$ is half an odd integral multiple of $\hbar$, the possible
eigenvalues of $M_z$ are all half odd integral multiples of $\hbar$ and the revolution
must change the sign of $|K'\rangle$. From symmetry, the application of a
revolution about any other axis must have the same effect on $|K'\rangle$
as one about the $z$-axis. We thus get the general result, the application
of one revolution about any axis leaves a ket unchanged or changes its
sign according to whether it belongs to eigenvalues of the magnitude of
the total angular momentum which are integral or half odd integral
multiples of $\hbar$. A state, of course, is always unaffected by the revolution,
since a state is unaffected by a change of sign of the ket
corresponding to it.
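
The sign rule for one revolution can be checked numerically. This is a minimal sketch, assuming the factor $e^{-i\phi M_z'/\hbar}$ obtained above and measuring $M_z'$ in units of $\hbar$; the function name `rotation_factor` is hypothetical.

```python
import cmath
from fractions import Fraction


def rotation_factor(mz_over_hbar, phi):
    """Numerical factor exp(-i * phi * M_z'/hbar) multiplying |M_z'>
    under a rotation through the angle phi about the z-axis."""
    return cmath.exp(-1j * phi * float(mz_over_hbar))


if __name__ == "__main__":
    for mz in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(2)):
        factor = rotation_factor(mz, 2 * cmath.pi)
        # The real part is +1 for integral values and -1 for half odd integral ones,
        # up to rounding error; the imaginary part is zero to the same accuracy.
        print(str(mz), round(factor.real, 12), round(factor.imag, 12))
```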

For a dynamical system involving only orbital angular momenta,
a ket must be unchanged by a revolution about an axis, since we can
set up Schrödinger's representation, with the coordinates of all the
particles diagonal, and the Schrödinger representative of a ket will
get brought back to its original value by the revolution. It follows
that the eigenvalues of the magnitude of an orbital angular momentum
are always integral multiples of $\hbar$. The eigenvalues of a component
of an orbital angular momentum are also always integral multiples
of $\hbar$. For a spin angular momentum, Schrödinger's representation
does not exist and both kinds of eigenvalue are possible.


37. The spin of the electron 

Electrons, and also some of the other fundamental particles (protons,
neutrons) have a spin whose magnitude is $\tfrac12\hbar$. This is found
from experimental evidence, and also there are theoretical reasons
showing that this spin value is more elementary than any other, even
spin zero (see Chapter XI). The study of this particular spin is therefore
of special importance.

For dealing with an angular momentum $\mathbf{m}$ whose magnitude is $\tfrac12\hbar$,
it is convenient to put

$$\mathbf{m} = \tfrac12\hbar\,\boldsymbol{\sigma}. \tag{48}$$

The components of the vector $\boldsymbol{\sigma}$ then satisfy, from (27),

$$\begin{aligned}
\sigma_y\sigma_z - \sigma_z\sigma_y &= 2i\sigma_x, \\
\sigma_z\sigma_x - \sigma_x\sigma_z &= 2i\sigma_y, \\
\sigma_x\sigma_y - \sigma_y\sigma_x &= 2i\sigma_z.
\end{aligned} \tag{49}$$

The eigenvalues of $m_z$ are $\tfrac12\hbar$ and $-\tfrac12\hbar$, so the eigenvalues of $\sigma_z$ are 1
and $-1$, and $\sigma_z^2$ has just the one eigenvalue 1. It follows that $\sigma_z^2$ must
equal 1, and similarly for $\sigma_x^2$ and $\sigma_y^2$, i.e.

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1. \tag{50}$$

We can get equations (49) and (50) into a simpler form by means of
some straightforward non-commutative algebra. From (50)

$$\sigma_y^2\sigma_z - \sigma_z\sigma_y^2 = 0$$

or

$$\sigma_y(\sigma_y\sigma_z - \sigma_z\sigma_y) + (\sigma_y\sigma_z - \sigma_z\sigma_y)\sigma_y = 0$$

or

$$\sigma_y\sigma_x + \sigma_x\sigma_y = 0$$

with the help of the first of equations (49). This means $\sigma_x\sigma_y = -\sigma_y\sigma_x$.
Two dynamical variables or linear operators like these which satisfy
the commutative law of multiplication except for a minus sign will
be said to anticommute. Thus $\sigma_x$ anticommutes with $\sigma_y$. From symmetry
each of the three dynamical variables $\sigma_x$, $\sigma_y$, $\sigma_z$ must
anticommute with any other. Equations (49) may now be written

$$\begin{aligned}
\sigma_y\sigma_z &= i\sigma_x = -\sigma_z\sigma_y, \\
\sigma_z\sigma_x &= i\sigma_y = -\sigma_x\sigma_z, \\
\sigma_x\sigma_y &= i\sigma_z = -\sigma_y\sigma_x,
\end{aligned} \tag{51}$$

and also from (50)

$$\sigma_x\sigma_y\sigma_z = i. \tag{52}$$

Equations (50), (51), (52) are the fundamental equations satisfied by
the spin variables $\boldsymbol{\sigma}$ describing a spin whose magnitude is $\tfrac12\hbar$.

Let us set up a matrix representation for the $\sigma$'s and let us take $\sigma_z$
to be diagonal. If there are no other independent dynamical variables
besides the $m$'s or $\sigma$'s in our dynamical system, then $\sigma_z$ by itself forms
a complete set of commuting observables, since the form of equations
(50) and (51) is such that we cannot construct out of $\sigma_x$, $\sigma_y$, and $\sigma_z$
any new dynamical variable that commutes with $\sigma_z$. The diagonal
elements of the matrix representing $\sigma_z$ being the eigenvalues 1 and
$-1$ of $\sigma_z$, the matrix itself will be

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Let $\sigma_x$ be represented by

$$\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}.$$

This matrix must be Hermitian, so that $a_1$ and $a_4$ must be real and
$a_2$ and $a_3$ conjugate complex numbers. The equation $\sigma_z\sigma_x = -\sigma_x\sigma_z$
gives us

$$\begin{pmatrix} a_1 & a_2 \\ -a_3 & -a_4 \end{pmatrix} = -\begin{pmatrix} a_1 & -a_2 \\ a_3 & -a_4 \end{pmatrix},$$

so that $a_1 = a_4 = 0$. Hence $\sigma_x$ is represented by a matrix of the form

$$\begin{pmatrix} 0 & a_2 \\ a_3 & 0 \end{pmatrix}.$$

The equation $\sigma_x^2 = 1$ now shows that $a_2 a_3 = 1$. Thus $a_2$ and $a_3$, being
conjugate complex numbers, must be of the form $e^{i\alpha}$ and $e^{-i\alpha}$
respectively, where $\alpha$ is a real number, so that $\sigma_x$ is represented by a
matrix of the form

$$\begin{pmatrix} 0 & e^{i\alpha} \\ e^{-i\alpha} & 0 \end{pmatrix}.$$

Similarly it may be shown that $\sigma_y$ is also represented by a matrix of
this form. By suitably choosing the phase factors in the representation,
which is not completely determined by the condition that $\sigma_z$
shall be diagonal, we can arrange that $\sigma_x$ shall be represented by the
matrix

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

The representative of $\sigma_y$ is then determined by the equation
$\sigma_y = i\sigma_x\sigma_z$. We thus obtain finally the three matrices

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{53}$$

to represent $\sigma_x$, $\sigma_y$, and $\sigma_z$ respectively, which matrices satisfy all the
algebraic relations (49), (50), (51), (52). The component of the vector
$\boldsymbol{\sigma}$ in an arbitrary direction specified by the direction cosines $l$, $m$, $n$,
namely $l\sigma_x + m\sigma_y + n\sigma_z$, is represented by

$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix}. \tag{54}$$
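
That the three matrices satisfy the relations (49) to (52) can be confirmed by direct multiplication. The following sketch is an illustration only; the helper names are introduced here, and the matrices are stored as nested lists so that no external library is assumed.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def subtract(a, b):
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def satisfies_spin_algebra(sx, sy, sz):
    """True if the three matrices obey relations (49), (50), (51) and (52)."""
    for a, b, c in ((sx, sy, sz), (sy, sz, sx), (sz, sx, sy)):
        if subtract(matmul(a, b), matmul(b, a)) != scale(2j, c):   # (49)
            return False
        if matmul(a, a) != IDENTITY:                               # (50)
            return False
        if matmul(a, b) != scale(1j, c) or matmul(b, a) != scale(-1j, c):  # (51)
            return False
    return matmul(matmul(sx, sy), sz) == scale(1j, IDENTITY)       # (52)


if __name__ == "__main__":
    print(satisfies_spin_algebra(SIGMA_X, SIGMA_Y, SIGMA_Z))
```

All entries are small integers or integer multiples of $i$, so the comparisons are exact. Interchanging two of the matrices reverses the sign in (49) and the check then fails.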

The representative of a ket vector will consist of just two numbers,
corresponding to the two values $+1$ and $-1$ for $\sigma_z'$. These two numbers
form a function of the variable $\sigma_z'$ whose domain consists of only
the two points $+1$ and $-1$. The state for which $\sigma_z$ has the value unity
will be represented by the function, $f_\alpha(\sigma_z')$ say, consisting of the pair
of numbers 1, 0 and that for which $\sigma_z$ has the value $-1$ will be
represented by the function, $f_\beta(\sigma_z')$ say, consisting of the pair 0, 1.

Any function of the variable $\sigma_z'$, i.e. any pair of numbers, can be
expressed as a linear combination of these two. Thus any state can
be obtained by superposition of the two states for which $\sigma_z$ equals $+1$ and
$-1$ respectively. For example, the state for which the component of
$\boldsymbol{\sigma}$ in the direction $l$, $m$, $n$, represented by (54), has the value $+1$ is
represented by the pair of numbers $a$, $b$ which satisfy

$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$$

or

$$\begin{aligned}
na + (l - im)b &= a, \\
(l + im)a - nb &= b.
\end{aligned}$$

Thus

$$\frac{a}{b} = \frac{l - im}{1 - n} = \frac{1 + n}{l + im}.$$

This state can be regarded as a superposition of the two states for
which $\sigma_z$ equals $+1$ and $-1$, the relative weights in the superposition
process being as

$$|a|^2 : |b|^2 = |l - im|^2 : (1 - n)^2 = 1 + n : 1 - n. \tag{55}$$
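
The ratio (55) can be verified for a particular direction. The sketch below assumes that $l$, $m$, $n$ are direction cosines, so that $l^2 + m^2 + n^2 = 1$, and that $n \neq -1$ (for $n = -1$ the state is simply the pair 0, 1). The pair returned is not normalized, and the function names are hypothetical.

```python
def spin_up_along(l, m, n):
    """Unnormalized pair (a, b) for which the component of sigma along
    (l, m, n) has the value +1, chosen so that a/b = (1 + n)/(l + i m)."""
    a = 1 + n
    b = complex(l, m)
    return a, b


def relative_weights(l, m, n):
    """The pair |a|^2, |b|^2 whose ratio appears in (55)."""
    a, b = spin_up_along(l, m, n)
    return abs(a) ** 2, abs(b) ** 2


if __name__ == "__main__":
    l, m, n = 0.6, 0.0, 0.8
    a, b = spin_up_along(l, m, n)
    # Residuals of the two linear equations preceding (55).
    print(abs(n * a + complex(l, -m) * b - a))
    print(abs(complex(l, m) * a - n * b - b))
    weight_a, weight_b = relative_weights(l, m, n)
    print(weight_a / weight_b, (1 + n) / (1 - n))
```

The two residuals should vanish to rounding accuracy, and the two ratios printed on the last line should agree.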

For the complete description of an electron (or other elementary
particle with spin $\tfrac12\hbar$) we require the spin dynamical variables $\boldsymbol{\sigma}$,
whose connexion with the spin angular momentum is given by (48),
together with the Cartesian coordinates $x$, $y$, $z$ and momenta $p_x$, $p_y$,
$p_z$. The spin dynamical variables commute with these coordinates
and momenta. Thus a complete set of commuting observables for a
system consisting of a single electron will be $x$, $y$, $z$, $\sigma_z$. In a
representation in which these are diagonal, the representative of any state
will be a function of four variables $x'$, $y'$, $z'$, $\sigma_z'$. Since $\sigma_z'$ has a domain
consisting of only two points, namely 1 and $-1$, this function of four
variables is the same as two functions of three variables, namely the
two functions

$$\langle x'y'z'|\rangle_+ = \langle x', y', z', +1|\rangle, \qquad \langle x'y'z'|\rangle_- = \langle x', y', z', -1|\rangle. \tag{56}$$

Thus the presence of the spin may be considered either as introducing a
new variable into the representative of a state or as giving this
representative two components.

38. Motion in a central field of force 

An atom consists of a massive positively charged nucleus together
with a number of electrons moving round, under the influence of the
attractive force of the nucleus and their own mutual repulsions. An
exact treatment of this dynamical system is a very difficult mathematical
problem. One can, however, gain some insight into the main
features of the system by making the rough approximation of regarding
each electron as moving independently in a certain central field
of force, namely that of the nucleus, assumed fixed, together with
some kind of average of the forces due to the other electrons. Thus
our present problem of the motion of a particle in a central field of
force forms a corner-stone in the theory of the atom.

Let the Cartesian coordinates of the particle, referred to a system
of axes with the centre of force as origin, be $x$, $y$, $z$ and the
corresponding components of momentum $p_x$, $p_y$, $p_z$. The Hamiltonian,
with neglect of relativistic mechanics, will be of the form

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V, \tag{57}$$

where $V$, the potential energy, is a function only of $(x^2 + y^2 + z^2)$. To
develop the theory it is convenient to introduce polar dynamical
variables. We introduce first the radius $r$, defined as the positive
square root

$$r = (x^2 + y^2 + z^2)^{1/2}.$$

Its eigenvalues go from 0 to $\infty$. If we evaluate its P.B.s with $p_x$, $p_y$,
and $p_z$, we obtain, with the help of formula (32) of § 22,

$$[r, p_x] = \frac{\partial r}{\partial x} = \frac{x}{r}, \qquad [r, p_y] = \frac{\partial r}{\partial y} = \frac{y}{r}, \qquad [r, p_z] = \frac{\partial r}{\partial z} = \frac{z}{r},$$

the same as in the classical theory. We introduce also the dynamical
variable $p_r$ defined by

$$p_r = r^{-1}(xp_x + yp_y + zp_z). \tag{58}$$

Its P.B. with $r$ is given by

$$\begin{aligned}
r[r, p_r] = [r, rp_r] &= [r, xp_x + yp_y + zp_z] \\
&= x[r, p_x] + y[r, p_y] + z[r, p_z] \\
&= x\cdot x/r + y\cdot y/r + z\cdot z/r = r.
\end{aligned}$$

Hence

$$[r, p_r] = 1$$

or

$$rp_r - p_r r = i\hbar.$$

The commutation relation between $r$ and $p_r$ is just the one for a
canonical coordinate and momentum, namely equation (10) of § 22.
This makes $p_r$ like the momentum conjugate to the $r$ coordinate, but
it is not exactly equal to this momentum because it is not real, its
conjugate complex being

$$\begin{aligned}
\bar p_r = (p_x x + p_y y + p_z z)r^{-1} &= (xp_x + yp_y + zp_z - 3i\hbar)r^{-1} \\
&= (rp_r - 3i\hbar)r^{-1} = p_r - 2i\hbar r^{-1}.
\end{aligned} \tag{59}$$

Thus $p_r - i\hbar r^{-1}$ is real and is the true momentum conjugate to $r$.

The angular momentum $\mathbf{m}$ of the particle about the origin is given
by (22) and its magnitude $k$ is given by (39). Since $r$ and $p_r$ are
scalars, they commute with $\mathbf{m}$, and therefore also with $k$.

We can express the Hamiltonian in terms of $r$, $p_r$, and $k$. We have,
if $\sum_{xyz}$ denotes a sum over cyclic permutations of the suffixes $x$, $y$, $z$,

$$\begin{aligned}
k(k + \hbar) &= \sum_{xyz} m_z^2 = \sum_{xyz} (xp_y - yp_x)^2 \\
&= \sum_{xyz} (xp_y\,xp_y + yp_x\,yp_x - xp_y\,yp_x - yp_x\,xp_y) \\
&= \sum_{xyz} (x^2p_y^2 + y^2p_x^2 - xp_xp_yy - yp_yp_xx + x^2p_x^2 - xp_xp_xx - 2i\hbar\,xp_x) \\
&= (x^2 + y^2 + z^2)(p_x^2 + p_y^2 + p_z^2) - (xp_x + yp_y + zp_z)(p_xx + p_yy + p_zz + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - rp_r(\bar p_r r + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - rp_r^2r,
\end{aligned}$$

from (59). Hence

$$H = \frac{1}{2m}\left(\frac{1}{r}p_r^2r + \frac{k(k + \hbar)}{r^2}\right) + V. \tag{60}$$

This form for $H$ is such that $k$ commutes not only with $H$, as is
necessary since $k$ is a constant of the motion, but also with every
dynamical variable occurring in $H$, namely $r$, $p_r$, and $V$, which is a
function of $r$. In consequence, a simple treatment becomes possible,
namely, we may consider an eigenstate of $k$ belonging to an eigenvalue
$k'$ and then we can substitute $k'$ for $k$ in (60) and get a problem
in one degree of freedom $r$.

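Let us introduce Schrödinger's representation with $x$, $y$, $z$ diagonal.
Then $p_x$, $p_y$, $p_z$ are equal to the operators $-i\hbar\,\partial/\partial x$, $-i\hbar\,\partial/\partial y$, $-i\hbar\,\partial/\partial z$
respectively. A state is represented by a wave function $\psi(xyzt)$
satisfying Schrödinger's wave equation (7) of § 27, which now reads, with
$H$ given by (57),

$$i\hbar\frac{\partial\psi}{\partial t} = \left\{-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + V\right\}\psi. \tag{61}$$

We may pass from the Cartesian coordinates $x$, $y$, $z$ to the polar
coordinates $r$, $\theta$, $\phi$ by means of the equations

$$\begin{aligned}
x &= r\sin\theta\cos\phi, \\
y &= r\sin\theta\sin\phi, \\
z &= r\cos\theta,
\end{aligned} \tag{62}$$

and may express the wave function in terms of the polar coordinates,
so that it reads $\psi(r\theta\phi t)$. The equations (62) give the operator equation

$$\frac{\partial}{\partial r} = \frac{\partial x}{\partial r}\frac{\partial}{\partial x} + \frac{\partial y}{\partial r}\frac{\partial}{\partial y} + \frac{\partial z}{\partial r}\frac{\partial}{\partial z} = \frac{x}{r}\frac{\partial}{\partial x} + \frac{y}{r}\frac{\partial}{\partial y} + \frac{z}{r}\frac{\partial}{\partial z},$$

which shows, on being compared with (58), that $p_r = -i\hbar\,\partial/\partial r$. Thus
Schrödinger's wave equation reads, with the form (60) for $H$,

$$i\hbar\frac{\partial\psi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{1}{r}\frac{\partial^2}{\partial r^2}r + \frac{k(k + \hbar)}{\hbar^2r^2}\right) + V\right\}\psi. \tag{63}$$

Here $k$ is a certain linear operator which, since it commutes with $r$
and $\partial/\partial r$, can involve only $\theta$, $\phi$, $\partial/\partial\theta$, and $\partial/\partial\phi$. From the formula

$$k(k + \hbar) = m_x^2 + m_y^2 + m_z^2, \tag{64}$$

which comes from (39), and from (62) one can work out the form of
$k(k + \hbar)$ and one finds

$$\frac{k(k + \hbar)}{\hbar^2} = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{65}$$

This operator is well known in mathematical physics. Its eigenfunctions
are called spherical harmonics and its eigenvalues are
$n(n + 1)$ where $n$ is an integer. Thus the theory of spherical harmonics
provides an alternative proof that the eigenvalues of $k$ are
integral multiples of $\hbar$.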
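
The eigenvalue $n(n+1)$ can be checked numerically in the simplest case, that of functions independent of $\phi$. The sketch below is an illustration under stated assumptions: it uses the Legendre polynomials $P_n(\cos\theta)$, which are the $\phi$-independent spherical harmonics up to normalization, and applies the $\theta$ part of the operator (65) by central finite differences. The step size and the sample angle are arbitrary choices, and the function names are hypothetical.

```python
import math


def legendre(n, x):
    """Legendre polynomial P_n(x) by the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for j in range(1, n):
        # (j + 1) P_{j+1} = (2j + 1) x P_j - j P_{j-1}
        p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
    return p


def angular_operator(f, theta, h=1e-3):
    """Finite-difference value of -(1/sin t) d/dt (sin t df/dt) at t = theta,
    for a function f(t) that does not depend on phi."""
    flux_plus = math.sin(theta + h / 2) * (f(theta + h) - f(theta)) / h
    flux_minus = math.sin(theta - h / 2) * (f(theta) - f(theta - h)) / h
    return -(flux_plus - flux_minus) / (h * math.sin(theta))


def eigenvalue_estimate(n, theta=0.5):
    """Ratio (operator applied to P_n) / P_n at the sample angle theta."""
    def f(t):
        return legendre(n, math.cos(t))
    return angular_operator(f, theta) / f(theta)


if __name__ == "__main__":
    for n in range(4):
        print(n, eigenvalue_estimate(n), n * (n + 1))
```

The estimate approaches $n(n+1)$ as the step $h$ is reduced, until rounding error takes over. The sample angle must not be a zero of $P_n(\cos\theta)$.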
For an eigenstate of $k$ belonging to the eigenvalue $n\hbar$ ($n$ a
non-negative integer) the wave function will be of the form

$$\psi = r^{-1}\chi(rt)S_n(\theta\phi), \tag{66}$$

where $S_n(\theta\phi)$ satisfies

$$k(k + \hbar)S_n(\theta\phi) = n(n + 1)\hbar^2S_n(\theta\phi), \tag{67}$$

i.e. from (65) $S_n$ is a spherical harmonic of order $n$. The factor $r^{-1}$
is inserted in (66) for convenience. Substituting (66) into (63), we
get as the equation for $\chi$

$$i\hbar\frac{\partial\chi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{\partial^2}{\partial r^2} + \frac{n(n + 1)}{r^2}\right) + V\right\}\chi. \tag{68}$$

If the state is a stationary state belonging to the energy value $H'$,
$\chi$ will be of the form

$$\chi = \chi_0(r)e^{-iH't/\hbar}$$

and (68) will reduce to

$$H'\chi_0 = \left\{\frac{\hbar^2}{2m}\left(-\frac{d^2}{dr^2} + \frac{n(n + 1)}{r^2}\right) + V\right\}\chi_0. \tag{69}$$

This equation may be used to determine the energy-levels $H'$ of the
system. For each solution $\chi_0$ of (69), arising from a given $n$, there
will be $2n + 1$ independent states, because there are $2n + 1$ independent
solutions of (67) corresponding to the $2n + 1$ different values
that a component of the angular momentum, $m_z$ say, can take on.

The probability of the particle being in an element of volume
$dx\,dy\,dz$ is proportional to $|\psi|^2\,dx\,dy\,dz$. With $\psi$ of the form (66) this
becomes $r^{-2}|\chi|^2|S_n|^2\,dx\,dy\,dz$. The probability of the particle being in
a spherical shell between $r$ and $r + dr$ is then proportional to $|\chi|^2\,dr$.
It now becomes clear that, in solving equation (68) or (69), we must
impose a boundary condition on the function $\chi$ at $r = 0$, namely the
function must be such that the integral to the origin $\int_0 |\chi|^2\,dr$ is
convergent. If this integral were not convergent, the wave function
would represent a state for which the chances are infinitely in favour
of the particle being at the origin and such a state would not be
physically admissible.

The boundary condition at $r = 0$ obtained by the above consideration
of probabilities is, however, not sufficiently stringent. We get a
more stringent condition by verifying that the wave function obtained
by solving the wave equation in polar coordinates (63) really satisfies
the wave equation in Cartesian coordinates (61). Let us take the case
of $V = 0$, giving us the problem of the free particle. Applied to a
stationary state with energy $H' = 0$, equation (61) gives

$$\nabla^2\psi = 0, \tag{70}$$

where $\nabla^2$ is written for the Laplacian operator $\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$,
and equation (63) gives

$$\left(\frac{1}{r}\frac{d^2}{dr^2}r - \frac{k(k + \hbar)}{\hbar^2r^2}\right)\psi = 0. \tag{71}$$

A solution of (71) for $k = 0$ is $\psi = r^{-1}$. This does not satisfy
(70), since, although $\nabla^2r^{-1}$ vanishes for any finite value of $r$, its integral
through a volume containing the origin is $-4\pi$ (as may be verified
by transforming this volume integral to a surface integral by means
of Gauss's theorem), and hence

$$\nabla^2r^{-1} = -4\pi\,\delta(x)\delta(y)\delta(z). \tag{72}$$

Thus not every solution of (71) gives a solution of (70), and more
generally, not every solution of (63) is a solution of (61). We must
impose on the solution of (63) the condition that it shall not tend to
infinity as rapidly as $r^{-1}$ when $r \to 0$ in order that, when substituted
into (61), it shall not give a $\delta$ function on the right like the right-hand
side of (72). Only when equation (63) is supplemented with this condition
does it become equivalent to equation (61). We thus have the
boundary condition $r\psi \to 0$ or $\chi \to 0$ as $r \to 0$.

There are also boundary conditions for the wave function at $r = \infty$.
If we are interested only in 'closed' states, i.e. states for which the
particle does not go off to infinity, we must restrict the integral to
infinity $\int^\infty |\chi(r)|^2\,dr$ to be convergent. These closed states, however,
are not the only ones that are physically permissible, as we can also
have states in which the particle arrives from infinity, is scattered
by the central field of force, and goes off to infinity again. For these
states the wave function may remain finite as $r \to \infty$. Such states will
be dealt with in Chapter VIII under the heading of collision problems.
In any case the wave function must not tend to infinity as $r \to \infty$, or
it will represent a state that has no physical meaning.


39. Energy-levels of the hydrogen atom 
The above analysis may be applied to the problem of the hydrogen
atom with neglect of relativistic mechanics and the spin of the
electron. The potential energy $V$ is now† $-e^2/r$, so that equation
(69) becomes

$$\left\{\frac{d^2}{dr^2} - \frac{n(n + 1)}{r^2} + \frac{2mH'}{\hbar^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}\chi_0 = 0. \tag{73}$$

A thorough investigation of this equation has been given by
Schrödinger.‡ We shall here obtain its eigenvalues $H'$ by an elementary
argument.

It is convenient to put

$$\chi_0 = f(r)e^{-r/a}, \tag{74}$$

introducing the new function $f(r)$, where $a$ is one or other of the
square roots

$$a = \pm\sqrt{-\hbar^2/2mH'}. \tag{75}$$

Equation (73) now becomes

$$\left\{\frac{d^2}{dr^2} - \frac{2}{a}\frac{d}{dr} - \frac{n(n + 1)}{r^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}f(r) = 0. \tag{76}$$

We look for a solution of this equation in the form of a power series

$$f(r) = \sum_s c_sr^s, \tag{77}$$

in which consecutive values for $s$ differ by unity although these
values themselves need not be integers. On substituting (77) in (76)
we obtain

$$\sum_s c_s\{s(s - 1)r^{s-2} - (2s/a)r^{s-1} - n(n + 1)r^{s-2} + (2me^2/\hbar^2)r^{s-1}\} = 0,$$

which gives, on equating to zero the coefficient of $r^{s-2}$, the following
relation between successive coefficients $c_s$,

$$c_s[s(s - 1) - n(n + 1)] = c_{s-1}[2(s - 1)/a - 2me^2/\hbar^2]. \tag{78}$$

We saw in the preceding section that only those eigenfunctions $\chi$
are allowed that tend to zero with $r$ and hence, from (74), $f(r)$ must
tend to zero with $r$. The series (77) must therefore terminate on the
side of small $s$ and the minimum value of $s$ must be greater than zero.
Now the only possible minimum values of $s$ are those that make the
coefficient of $c_s$ in (78) vanish, i.e. $n + 1$ and $-n$, and the second
of these is negative or zero. Thus the minimum value of $s$ must be
$n + 1$. Since $n$ is always an integer, the values of $s$ will all be integers.

† The $e$ here, denoting minus the charge on an electron, is, of course, to be
distinguished from the $e$ denoting the base of exponentials.
‡ Schrödinger, Ann. d. Physik, 79 (1926), 361.
The series (77) will in general extend to infinity on the side of large $s$.
For large values of $s$ the ratio of successive terms is

$$\frac{c_s}{c_{s-1}}r = \frac{2r}{sa}$$

according to (78). Thus the series (77) will always converge, as the
ratios of the higher terms to one another are the same as for the
series

$$\sum_s \frac{1}{s!}\left(\frac{2r}{a}\right)^s, \tag{79}$$

which converges to $e^{2r/a}$.

We must now examine how our solution $\chi_0$ behaves for large
values of $r$. We must distinguish between the two cases of $H'$ positive
and $H'$ negative. For $H'$ negative, $a$ given by (75) will be real. Suppose
we take the positive value for $a$. Then as $r \to \infty$ the sum of the
series (77) will tend to infinity according to the same law as the sum
of the series (79), i.e. the law $e^{2r/a}$. Thus, from (74), $\chi_0$ will tend to
infinity according to the law $e^{r/a}$ and will not represent a physically
possible state. There is therefore in general no permissible solution
of (73) for negative values of $H'$. An exception arises, however, whenever
the series (77) terminates on the side of large $s$, in which case the
boundary conditions are all satisfied. The condition for this termination
of the series is that the coefficient of $c_{s-1}$ in (78) shall vanish for
some value of the suffix $s - 1$ not less than its minimum value $n + 1$,
which is the same as the condition that

$$\frac{s}{a} - \frac{me^2}{\hbar^2} = 0$$

for some integer $s$ not less than $n + 1$. With the help of (75) this
condition becomes

$$H' = -\frac{me^4}{2s^2\hbar^2}, \tag{80}$$

and is thus a condition for the energy-level $H'$. Since $s$ may be any
positive integer, the formula (80) gives a discrete set of negative
energy-levels for the hydrogen atom. These are in agreement with
experiment. For each of them (except the lowest one $s = 1$) there
are several independent states, as there are various possible values
for $n$, namely any positive or zero integer less than $s$. This multiplicity
of states belonging to an energy-level is in addition to that
mentioned in the preceding section arising from the various possible
values for a component of angular momentum, which latter multiplicity
occurs with any central field of force. The $n$ multiplicity occurs
only with an inverse square law of force and even then is removed
when one takes relativistic mechanics into account, as will be found
in Chapter XI. The solution $\chi_0$ of (73) when $H'$ satisfies (80) tends to
zero exponentially as $r \to \infty$ and thus represents a closed state
(corresponding to an elliptic orbit in Bohr's theory).
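
The termination of the series (77) can be followed explicitly by iterating the recurrence (78). The sketch below is an illustration only, assuming units in which $\hbar = m = e = 1$, so that the termination condition reads $s/a = 1$, and taking the arbitrary normalization $c_{n+1} = 1$. The function name `series_coefficients` and its parameters are hypothetical.

```python
from fractions import Fraction


def series_coefficients(n, s_max, length):
    """Coefficients c_s of the series (77) for s = n+1, ..., n+length,
    generated from (78) with hbar = m = e = 1 and a = s_max.

    With this value of a the condition s/a - me^2/hbar^2 = 0 holds at s = s_max,
    so all coefficients beyond c_{s_max} should vanish (for s_max >= n + 1).
    """
    a = Fraction(s_max)
    coeffs = {n + 1: Fraction(1)}  # arbitrary normalization of the first term
    for s in range(n + 2, n + 1 + length):
        numerator = 2 * Fraction(s - 1) / a - 2        # 2(s-1)/a - 2me^2/hbar^2
        denominator = s * (s - 1) - n * (n + 1)        # non-zero for s > n + 1
        coeffs[s] = coeffs[s - 1] * numerator / denominator
    return coeffs


if __name__ == "__main__":
    for n, s_max in ((0, 1), (0, 2), (1, 2), (1, 3)):
        coeffs = series_coefficients(n, s_max, 5)
        print(n, s_max, {s: str(c) for s, c in coeffs.items()})
```

For $n = 0$ and $a = 2$, for instance, the series reduces to $f(r) = r - \tfrac12 r^2$, a polynomial, so that $\chi_0 = f(r)e^{-r/2}$ tends to zero at both ends of the range of $r$.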

For any positive values of $H'$, $a$ given by (75) will be pure imaginary.
The series (77), which is like the series (79) for large $r$, will now have a
sum that remains finite as $r \to \infty$. Thus $\chi_0$ given by (74) will now remain
finite as $r \to \infty$ and will therefore be a permissible solution of (73),
giving a wave function $\psi$ that tends to zero according to the law $r^{-1}$ as
$r \to \infty$. Hence in addition to the discrete set of negative energy-levels
(80), all positive energy-levels are allowed. The states of positive
energy are not closed, since for them the integral to infinity $\int^\infty |\chi_0|^2\,dr$
does not converge. (These states correspond to the hyperbolic orbits
of Bohr's theory.)
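
The discrete levels (80) can be put into numbers. The sketch below is an illustration under stated assumptions: the formula of the text is written in Gaussian units, and its $e^2$ is replaced here by $e^2/4\pi\epsilon_0$ to work in SI units. The constants are standard recommended values, the nucleus is taken as fixed, and the function names are hypothetical.

```python
import math

ELECTRON_MASS = 9.1093837015e-31      # kg
ELEMENTARY_CHARGE = 1.602176634e-19   # C
HBAR = 1.054571817e-34                # J s
EPSILON_0 = 8.8541878128e-12          # F/m


def energy_level_ev(s):
    """Energy-level H' = -m e^4 / (2 s^2 hbar^2) of (80), in electron-volts.

    The Gaussian e^2 of the text corresponds to e^2 / (4 pi epsilon_0) in SI units.
    """
    e2 = ELEMENTARY_CHARGE ** 2 / (4 * math.pi * EPSILON_0)
    joules = -ELECTRON_MASS * e2 ** 2 / (2 * s ** 2 * HBAR ** 2)
    return joules / ELEMENTARY_CHARGE


def number_of_states(s):
    """Independent states belonging to the level s, with spin neglected:
    n runs from 0 to s - 1 and each n contributes 2n + 1 values of m_z."""
    return sum(2 * n + 1 for n in range(s))


if __name__ == "__main__":
    for s in range(1, 5):
        print(s, round(energy_level_ev(s), 4), number_of_states(s))
```

The lowest level comes out close to $-13.6$ electron-volts, and the number of independent states belonging to the level $s$ is $s^2$.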

40. Selection rules 

If a dynamical system is set up in a certain stationary state, it will
remain in that stationary state so long as it is not acted upon by
outside forces. Any atomic system in practice, however, frequently
gets acted upon by external electromagnetic fields, under whose
influence it is liable to cease to be in one stationary state and to make
a transition to another. The theory of such transitions will be developed
in §§ 44 and 45. A result of this theory is that, to a high degree
of accuracy, transitions between two states cannot occur under the
influence of electromagnetic radiation if, in a Heisenberg representation
with these two stationary states as two of the basic states, the
matrix element, referring to these two states, of the representative
of the total electric displacement $\mathbf{D}$ of the system vanishes. Now it
happens for many atomic systems that the great majority of the
matrix elements of $\mathbf{D}$ in a Heisenberg representation do vanish, and
hence there are severe limitations on the possibilities for transitions.
The rules that express these limitations are called selection rules.

The idea of selection rules can be refined by a more detailed
application of the theory of §§ 44 and 45, according to which
the matrix elements of the different Cartesian components of the
vector $\mathbf{D}$ are associated with different states of polarization of the
electromagnetic radiation. The nature of this association is just what
one would get if one considered the matrix elements, or rather their
real parts, as the amplitudes of harmonic oscillators which interact
with the field of radiation according to classical electrodynamics.

There is a general method for obtaining all selection rules, as
follows. Let us call the constants of the motion which are diagonal in
the Heisenberg representation $\alpha$'s and let $D$ be one of the Cartesian
components of $\mathbf{D}$. We must obtain an algebraic equation connecting
$D$ and the $\alpha$'s which does not involve any dynamical variables other
than $D$ and the $\alpha$'s and which is linear in $D$. Such an equation will
be of the form

$$\sum_r f_rDg_r = 0, \tag{81}$$

where the $f_r$'s and $g_r$'s are functions of the $\alpha$'s only. If this equation
is expressed in terms of representatives, it gives us

$$\sum_r f_r(\alpha')\langle\alpha'|D|\alpha''\rangle g_r(\alpha'') = 0,$$

or

$$\langle\alpha'|D|\alpha''\rangle\sum_r f_r(\alpha')g_r(\alpha'') = 0,$$

which shows that $\langle\alpha'|D|\alpha''\rangle = 0$ unless

$$\sum_r f_r(\alpha')g_r(\alpha'') = 0. \tag{82}$$

This last equation, giving the connexion which must exist between
$\alpha'$ and $\alpha''$ in order that $\langle\alpha'|D|\alpha''\rangle$ may not vanish, constitutes the
selection rule, so far as the component $D$ of $\mathbf{D}$ is concerned.

Our work on the harmonic oscillator in § 34 provides an example
of a selection rule. Equation (8) is of the form (81) with $\bar\eta$ for $D$ and
$H$ playing the part of the $\alpha$'s, and it shows that the matrix elements
$\langle H'|\bar\eta|H''\rangle$ of $\bar\eta$ all vanish except those for which $H''-H' = \hbar\omega$. The
conjugate complex of this result is that the matrix elements $\langle H'|\eta|H''\rangle$
of $\eta$ all vanish except those for which $H''-H' = -\hbar\omega$. Since $q$ is a
numerical multiple of $\eta-\bar\eta$, its matrix elements $\langle H'|q|H''\rangle$ all vanish
except those for which $H''-H' = \pm\hbar\omega$. If the harmonic oscillator
carries an electric charge, its electric displacement $D$ will be
proportional to $q$. The selection rule is then that only those transitions
can take place in which the energy $H$ changes by a single quantum
$\hbar\omega$.
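
The oscillator rule can be illustrated with a small numerical sketch. It assumes the standard number-basis matrix elements $\sqrt{n}$ of the lowering operator (not derived in this section) and a basis truncated to a few levels; the function names are hypothetical. Only the pattern of non-zero elements matters here, so the relative phase between $\eta$ and $\bar\eta$ in $q$ is ignored.

```python
import math


def lowering_matrix(n_levels):
    """Lowering operator in a number basis truncated to n_levels states."""
    a = [[0.0] * n_levels for _ in range(n_levels)]
    for n in range(1, n_levels):
        a[n - 1][n] = math.sqrt(n)  # element between levels n-1 and n
    return a


def displacement_matrix(n_levels):
    """q up to a numerical factor: lowering operator plus its transpose."""
    a = lowering_matrix(n_levels)
    return [[a[i][j] + a[j][i] for j in range(n_levels)]
            for i in range(n_levels)]


def allowed_level_jumps(matrix, tol=1e-12):
    """Sorted (column - row) differences at which the matrix is non-zero."""
    jumps = set()
    for row, values in enumerate(matrix):
        for col, value in enumerate(values):
            if abs(value) > tol:
                jumps.add(col - row)
    return sorted(jumps)


print(allowed_level_jumps(displacement_matrix(6)))  # prints [-1, 1]
```

Every non-zero element connects neighbouring levels, i.e. the energy changes by one quantum, in line with the rule stated above.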

We shall now obtain the selection rules for $m_z$ and $k$ for an electron
moving in a central field of force. The components of electric
displacement are here proportional to the Cartesian coordinates $x$, $y$, $z$.
Taking first $m_z$, we have that $m_z$ commutes with $z$, or that

$$m_z z - z m_z = 0.$$

This is an equation of the required type (81), giving us the selection
rule

$$m_z' - m_z'' = 0$$

for the $z$-component of the displacement. Again, from equations
(23) we have

$$[m_z, [m_z, x]] = [m_z, y] = -x$$

or

$$m_z^2 x - 2 m_z x m_z + x m_z^2 - \hbar^2 x = 0,$$

which is also of the type (81) and gives us the selection rule

$$m_z'^2 - 2 m_z' m_z'' + m_z''^2 - \hbar^2 = 0$$

or

$$(m_z' - m_z'' - \hbar)(m_z' - m_z'' + \hbar) = 0$$

for the $x$-component of the displacement. The selection rule for the
$y$-component is the same. Thus our selection rules for $m_z$ are that
in transitions associated with radiation with a polarization corresponding
to an electric dipole in the $z$-direction, $m_z'$ cannot change, while in
transitions associated with a polarization corresponding to an electric dipole
in the $x$-direction or $y$-direction, $m_z'$ must change by $\pm\hbar$.
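
As a hedged numerical check of the operator identity used for the $x$-component, the sketch below substitutes the standard $3\times3$ matrices for angular momentum one (units with $\hbar = 1$). This is an assumption made purely for illustration: the $x$-component of the angular momentum stands in for $x$, since it obeys the same commutation relations with $m_z$ as the component of any vector. The helper names are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)
# m_z is diagonal with eigenvalues 1, 0, -1 (hbar = 1).
M_Z = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -1.0]]
# Stand-in for x: the x-component of an angular momentum of magnitude one.
X_LIKE = [[0.0, S, 0.0], [S, 0.0, S], [0.0, S, 0.0]]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def type_81_residual(m, x, hbar=1.0):
    """m^2 x - 2 m x m + x m^2 - hbar^2 x, which should vanish identically."""
    mm = matmul(m, m)
    mmx = matmul(mm, x)
    mxm = matmul(matmul(m, x), m)
    xmm = matmul(x, mm)
    n = len(m)
    return [[mmx[i][j] - 2.0 * mxm[i][j] + xmm[i][j] - hbar ** 2 * x[i][j]
             for j in range(n)] for i in range(n)]


def max_abs(matrix):
    """Largest absolute value among the matrix elements."""
    return max(abs(v) for row in matrix for v in row)


def changes_of_m(m, x, tol=1e-12):
    """Values of m' - m'' at which the element of x between them is non-zero."""
    n = len(m)
    return sorted({m[i][i] - m[j][j]
                   for i in range(n) for j in range(n) if abs(x[i][j]) > tol})


print(max_abs(type_81_residual(M_Z, X_LIKE)))
print(changes_of_m(M_Z, X_LIKE))
```

The residual vanishes to rounding accuracy, and the only changes of $m_z$ that occur are $+1$ and $-1$ in units of $\hbar$.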

We can determine more accurately the state of polarization of the
radiation associated with a transition in which $m_z'$ changes by $\pm\hbar$, by
considering the condition for the non-vanishing of matrix elements
of $x+iy$ and $x-iy$. We have

$$[m_z, x+iy] = y - ix = -i(x+iy)$$

or

$$m_z(x+iy) - (x+iy)(m_z+\hbar) = 0,$$

which is again of the type (81). It gives

$$m_z' - m_z'' - \hbar = 0$$

as the condition that $\langle m_z'|x+iy|m_z''\rangle$ shall not vanish. Similarly,

$$m_z' - m_z'' + \hbar = 0$$

is the condition that $\langle m_z'|x-iy|m_z''\rangle$ shall not vanish. Hence

$$\langle m_z'|x-iy|m_z'-\hbar\rangle = 0$$

or

$$\langle m_z'|x|m_z'-\hbar\rangle = i\langle m_z'|y|m_z'-\hbar\rangle = (a+ib)e^{i\omega t}$$

say, $a$, $b$, and $\omega$ being real. The conjugate complex of this is

$$\langle m_z'-\hbar|x|m_z'\rangle = -i\langle m_z'-\hbar|y|m_z'\rangle = (a-ib)e^{-i\omega t}.$$

Thus the vector $\tfrac12\{\langle m_z'|\mathbf{D}|m_z'-\hbar\rangle + \langle m_z'-\hbar|\mathbf{D}|m_z'\rangle\}$, which determines
the state of polarization of the radiation associated with transitions
for which $m_z'' = m_z'-\hbar$, has the following three components

$$\tfrac12\{\langle m_z'|x|m_z'-\hbar\rangle + \langle m_z'-\hbar|x|m_z'\rangle\} = \tfrac12\{(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\cos\omega t - b\sin\omega t,$$

$$\tfrac12\{\langle m_z'|y|m_z'-\hbar\rangle + \langle m_z'-\hbar|y|m_z'\rangle\} = \tfrac12 i\{-(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\sin\omega t + b\cos\omega t, \qquad (83)$$

$$\tfrac12\{\langle m_z'|z|m_z'-\hbar\rangle + \langle m_z'-\hbar|z|m_z'\rangle\} = 0.$$

From the form of these components we see that the associated radiation
moving in the $z$-direction will be circularly polarized, that
moving in any direction in the $xy$-plane will be linearly polarized in
this plane, and that moving in intermediate directions will be
elliptically polarized. The direction of circular polarization for radiation
moving in the $z$-direction will depend on whether $\omega$ is positive
or negative, and this will depend on which of the two states $m_z'$ or
$m_z'' = m_z'-\hbar$ has the greater energy.
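
The components (83) can be evaluated directly. The following is a minimal sketch with hypothetical function and parameter names; it builds the two matrix elements from the complex amplitude $(a+ib)e^{i\omega t}$ and takes real parts, which is what half the sum of an element and its conjugate complex amounts to.

```python
import cmath


def dipole_vector(a, b, omega, t):
    """Real vector with components (83) for a transition m_z' -> m_z' - hbar."""
    amplitude = (a + 1j * b) * cmath.exp(1j * omega * t)
    x_element = amplitude            # element of x between m_z' and m_z' - hbar
    y_element = -1j * amplitude      # since the x element equals i times the y element
    # Half the sum of an element and its conjugate complex is its real part.
    return (x_element.real, y_element.real, 0.0)


for time in (0.0, 0.5, 1.0):
    x, y, z = dipole_vector(0.3, 0.4, 2.0, time)
    print(time, x * x + y * y, z)
```

The squared length $x^2 + y^2$ stays equal to $a^2 + b^2$ at all times while the vector rotates in the $xy$-plane, which is the circular polarization described above for radiation moving in the $z$-direction.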

We shall now determine the selection rule for $k$. We have

$$[k(k+\hbar), z] = [m_x^2, z] + [m_y^2, z]$$
$$= -y m_x - m_x y + x m_y + m_y x$$
$$= 2(m_y x - m_x y + i\hbar z)$$
$$= 2(m_y x - y m_x) = 2(x m_y - m_x y).$$

Similarly,

$$[k(k+\hbar), x] = 2(y m_z - m_y z)$$

and

$$[k(k+\hbar), y] = 2(m_x z - x m_z).$$

Hence

$$[k(k+\hbar), [k(k+\hbar), z]]$$
$$= 2[k(k+\hbar),\, m_y x - m_x y + i\hbar z]$$
$$= 2m_y[k(k+\hbar), x] - 2m_x[k(k+\hbar), y] + 2i\hbar[k(k+\hbar), z]$$
$$= 4m_y(y m_z - m_y z) - 4m_x(m_x z - x m_z) + 2\{k(k+\hbar)z - zk(k+\hbar)\}$$
$$= 4(m_x x + m_y y + m_z z)m_z - 4(m_x^2 + m_y^2 + m_z^2)z + 2\{k(k+\hbar)z - zk(k+\hbar)\}.$$

From (22)

$$m_x x + m_y y + m_z z = 0 \qquad (84)$$

and hence

$$[k(k+\hbar), [k(k+\hbar), z]] = -2\{k(k+\hbar)z + zk(k+\hbar)\},$$

which gives

$$k^2(k+\hbar)^2 z - 2k(k+\hbar)\,z\,k(k+\hbar) + z k^2(k+\hbar)^2 - 2\hbar^2\{k(k+\hbar)z + zk(k+\hbar)\} = 0. \qquad (85)$$

Similar equations hold for $x$ and $y$. These equations are of the
required type (81), and give us the selection rule

$$k'^2(k'+\hbar)^2 - 2k'(k'+\hbar)k''(k''+\hbar) + k''^2(k''+\hbar)^2 - 2\hbar^2 k'(k'+\hbar) - 2\hbar^2 k''(k''+\hbar) = 0,$$

which reduces to

$$(k'+k''+2\hbar)(k'+k'')(k'-k''+\hbar)(k'-k''-\hbar) = 0.$$

A transition can take place between two states $k'$ and $k''$ only if one
of these four factors vanishes.
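
The reduction to four factors is easy to get wrong by hand, so here is a short check in exact integer arithmetic, assuming units with $\hbar = 1$ and integer values of $k$. The function names are hypothetical.

```python
def rule_polynomial(k1, k2, hbar=1):
    """Left-hand side of the selection rule that follows from (85)."""
    a = k1 * (k1 + hbar)
    b = k2 * (k2 + hbar)
    return a * a - 2 * a * b + b * b - 2 * hbar ** 2 * (a + b)


def factorised_form(k1, k2, hbar=1):
    """The same expression written as a product of four factors."""
    return ((k1 + k2 + 2 * hbar) * (k1 + k2)
            * (k1 - k2 + hbar) * (k1 - k2 - hbar))


def pairs_allowed_by_factors(k_max):
    """All (k', k'') up to k_max for which one of the four factors vanishes."""
    return [(k1, k2)
            for k1 in range(k_max + 1)
            for k2 in range(k_max + 1)
            if factorised_form(k1, k2) == 0]


# The two forms agree for every pair tried.
assert all(rule_polynomial(p, q) == factorised_form(p, q)
           for p in range(8) for q in range(8))
print(pairs_allowed_by_factors(2))
# prints [(0, 0), (0, 1), (1, 0), (1, 2), (2, 1)]
```

Apart from the pairs differing by one unit, the only pair that passes this necessary condition is $k' = k'' = 0$, through the factor $(k'+k'')$.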

Now the first of the factors, $(k'+k''+2\hbar)$, can never vanish, since
the eigenvalues of $k$ are all positive or zero. The second, $(k'+k'')$, can
vanish only if $k' = 0$ and $k'' = 0$. But transitions between two states
with these values for $k$ cannot occur on account of other selection
rules, as may be seen from the following argument. If two states
(labelled respectively with a single prime and a double prime) are
such that $k' = 0$ and $k'' = 0$, then from (41) and the corresponding
results for $m_x$ and $m_y$, $m_x' = m_y' = m_z' = 0$ and $m_x'' = m_y'' = m_z'' = 0$.
The selection rule for $m_z$ now shows that the matrix elements of
$x$ and $y$ referring to the two states must vanish, as the value of $m_z$
does not change during the transition, and the similar selection rule
for $m_x$ or $m_y$ shows that the matrix element of $z$ also vanishes. Thus
transitions between the two states cannot occur. Our selection rule
for $k$ now reduces to

$$(k'-k''+\hbar)(k'-k''-\hbar) = 0,$$

showing that $k$ must change by $\pm\hbar$. This selection rule may be written

$$k'^2 - 2k'k'' + k''^2 - \hbar^2 = 0,$$

and since this is the condition that a matrix element $\langle k'|z|k''\rangle$ shall
not vanish, we get the equation

$$k^2 z - 2kzk + zk^2 - \hbar^2 z = 0$$

or

$$[k, [k, z]] = -z, \qquad (86)$$

a result which could not easily be obtained in a more direct way.

As a final example we shall obtain the selection rule for the
magnitude $K$ of the total angular momentum $\mathbf{M}$ of a general atomic system.
Let $x$, $y$, $z$ be the coordinates of one of the electrons. We must obtain
the condition that the $(K', K'')$ matrix element of $x$, $y$, or $z$ shall not
vanish. This is evidently the same as the condition that the $(K', K'')$
matrix element of $\lambda_1$, $\lambda_2$, or $\lambda_3$ shall not vanish, where $\lambda_1$, $\lambda_2$, and $\lambda_3$
are any three independent linear functions of $x$, $y$, and $z$ with numerical
coefficients, or more generally with any coefficients that commute
with $K$ and are thus represented by matrices which are diagonal with
respect to $K$. Let

$$\lambda_0 = M_x x + M_y y + M_z z,$$
$$\lambda_x = M_y z - M_z y - i\hbar x,$$
$$\lambda_y = M_z x - M_x z - i\hbar y,$$
$$\lambda_z = M_x y - M_y x - i\hbar z.$$

We have

$$M_x\lambda_x + M_y\lambda_y + M_z\lambda_z = \sum_{xyz}(M_x M_y z - M_x M_z y - i\hbar M_x x)$$
$$= \sum_{xyz}(M_x M_y - M_y M_x - i\hbar M_z)z = 0 \qquad (87)$$

from (29), the sums being taken over cyclic permutations of $x$, $y$, $z$.
Thus $\lambda_x$, $\lambda_y$, and $\lambda_z$ are not linearly independent functions
of $x$, $y$, and $z$. Any two of them, however, together with $\lambda_0$ are three
linearly independent functions of $x$, $y$, and $z$ and may be taken as the
above $\lambda_1$, $\lambda_2$, $\lambda_3$, since the coefficients $M_x$, $M_y$, $M_z$ all commute with $K$.
Our problem thus reduces to finding the condition that the $(K', K'')$
matrix elements of $\lambda_0$, $\lambda_x$, $\lambda_y$, and $\lambda_z$ shall not vanish. The physical
meanings of these $\lambda$'s are that $\lambda_0$ is proportional to the component of
the vector $(x, y, z)$ in the direction of the vector $\mathbf{M}$, and $\lambda_x$, $\lambda_y$, $\lambda_z$ are
proportional to the Cartesian components of the component of $(x, y, z)$
perpendicular to $\mathbf{M}$.

Since $\lambda_0$ is a scalar it must commute with $K$. It follows that only
the diagonal elements $\langle K'|\lambda_0|K'\rangle$ of $\lambda_0$ can differ from zero, so the
selection rule is that $K$ cannot change so far as $\lambda_0$ is concerned.
Applying (30) to the vector $\lambda_x$, $\lambda_y$, $\lambda_z$, we have

$$[M_z, \lambda_x] = \lambda_y, \qquad [M_z, \lambda_y] = -\lambda_x, \qquad [M_z, \lambda_z] = 0.$$

These relations between $M_z$ and $\lambda_x$, $\lambda_y$, $\lambda_z$ are of exactly the same form
as the relations (23), (24) between $m_z$ and $x$, $y$, $z$, and also (87) is of
the same form as (84). The dynamical variables $\lambda_x$, $\lambda_y$, $\lambda_z$ thus have the
same properties relative to the angular momentum $\mathbf{M}$ as $x$, $y$, $z$ have
relative to $\mathbf{m}$. The deduction of the selection rule for $k$ when the
electric displacement is proportional to $(x, y, z)$ can therefore be taken
over and applied to the selection rule for $K$ when the electric
displacement is proportional to $(\lambda_x, \lambda_y, \lambda_z)$. We find in this way that, so far as
$\lambda_x$, $\lambda_y$, $\lambda_z$ are concerned, the selection rule for $K$ is that it must change
by $\pm\hbar$.

Collecting results, we have as the selection rule for $K$ that it must
change by $0$ or $\pm\hbar$. We have considered the electric displacement
produced by only one of the electrons, but the same selection rule
must hold for each electron and thus also for the total electric
displacement.


41. The Zeeman effect for the hydrogen atom 
We shall now consider the system of a hydrogen atom in a uniform
magnetic field. The Hamiltonian (57) with $V = -e^2/r$, which describes
the hydrogen atom in no external field, gets modified by the magnetic
field, the modification, according to classical mechanics, consisting
in the replacement of the components of momentum, $p_x$, $p_y$, $p_z$, by
$p_x + (e/c)A_x$, $p_y + (e/c)A_y$, $p_z + (e/c)A_z$, where $A_x$, $A_y$, $A_z$ are the
components of the vector potential describing the field. For a uniform
field of magnitude $\mathcal{H}$ in the direction of the $z$-axis we may take
$A_x = -\tfrac12\mathcal{H}y$, $A_y = \tfrac12\mathcal{H}x$, $A_z = 0$. The classical Hamiltonian will
then be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac{1}{2}\frac{e}{c}\mathcal{H}y\right)^2 + \left(p_y + \frac{1}{2}\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r}.$$

This classical Hamiltonian may be taken over into the quantum
theory if we add on to it a term giving the effect of the spin of the
electron. According to experimental evidence and according to the
theory of Chapter XI, the electron has a magnetic moment $-(e\hbar/2mc)\,\boldsymbol{\sigma}$,
where $\boldsymbol{\sigma}$ is the spin vector of § 37. The energy of this magnetic moment
in the magnetic field will be $(e\hbar\mathcal{H}/2mc)\,\sigma_z$. Thus the total quantum
Hamiltonian will be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac{1}{2}\frac{e}{c}\mathcal{H}y\right)^2 + \left(p_y + \frac{1}{2}\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r} + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z. \qquad (88)$$

There ought strictly to be other terms in this Hamiltonian giving the 
interaction of the magnetic moment of the electron with the electric 
field of the nucleus of the atom, but this effect is small, of the same 
order of magnitude as the correction one gets by taking relativistic 
mechanics into account, and will be neglected here. It will be taken 
into account in the relativistic theory of the electron given in 
Chapter XI. 

If the magnetic field is not too large, we can neglect terms involving
$\mathcal{H}^2$, so that the Hamiltonian (88) reduces to

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r} + \frac{e\mathcal{H}}{2mc}(x p_y - y p_x) + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z. \qquad (89)$$

The extra terms due to the magnetic field are now $(e\mathcal{H}/2mc)(m_z + \hbar\sigma_z)$.
But these extra terms commute with the total Hamiltonian and are
thus constants of the motion. This makes the problem very easy.
The stationary states of the system, i.e. the eigenstates of the
Hamiltonian (89), will be those eigenstates of the Hamiltonian for no field
that are simultaneously eigenstates of the observables $m_z$ and $\sigma_z$, or
at least of the one observable $m_z + \hbar\sigma_z$, and the energy-levels of the
system will be those for the system with no field, given by (80) if
one considers only closed states, increased by an eigenvalue of
$(e\mathcal{H}/2mc)(m_z + \hbar\sigma_z)$. Thus stationary states of the system with no
field for which $m_z$ has the numerical value $m_z'$, an integral multiple
of $\hbar$, and for which also $\sigma_z$ has the numerical value $\sigma_z' = \pm1$, will still
be stationary states when the field is applied. Their energy will be
increased by an amount consisting of the sum of two parts, a part
$(e\mathcal{H}/2mc)\,m_z'$ arising from the orbital motion, which part may be
considered as due to an orbital magnetic moment $-e m_z'/2mc$, and a part
$(e\mathcal{H}/2mc)\,\hbar\sigma_z'$ arising from the spin. The ratio of the orbital magnetic
moment to the orbital angular momentum $m_z'$ is $-e/2mc$, which is
half the ratio of the spin magnetic moment to the spin angular
momentum. This fact is sometimes referred to as the magnetic
anomaly of the spin.
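
To give a feeling for the magnitudes involved, the level shift $(e\mathcal{H}/2mc)(m_z' + \hbar\sigma_z')$ can be evaluated numerically. The sketch below is only illustrative: the Gaussian-unit values of $e$, $m$, $c$ and $\hbar$ are assumed reference values that are not quoted in the text, the function names are hypothetical, and the spin angular momentum is taken as $\tfrac12\hbar\sigma_z$.

```python
# Gaussian (CGS) constants; assumed reference values, not taken from the text.
E_CHARGE = 4.80320471e-10   # electron charge magnitude, esu
E_MASS = 9.1093837e-28      # electron mass, g
C_LIGHT = 2.99792458e10     # velocity of light, cm/s
HBAR = 1.054571817e-27      # erg s


def zeeman_level_shift(m_z_units, sigma_z, field_gauss):
    """Energy shift (e H / 2mc)(m_z' + hbar sigma_z') in erg.

    m_z_units is m_z' in units of hbar (an integer); sigma_z is +1 or -1.
    """
    if sigma_z not in (1, -1):
        raise ValueError("sigma_z must be +1 or -1")
    prefactor = E_CHARGE * field_gauss / (2.0 * E_MASS * C_LIGHT)
    return prefactor * HBAR * (m_z_units + sigma_z)


def moment_to_momentum_ratios():
    """Ratios (magnetic moment)/(angular momentum) for orbit and for spin."""
    orbital = -E_CHARGE / (2.0 * E_MASS * C_LIGHT)
    spin_moment = -E_CHARGE * HBAR / (2.0 * E_MASS * C_LIGHT)  # per unit sigma_z
    spin_momentum = 0.5 * HBAR                                  # per unit sigma_z
    return orbital, spin_moment / spin_momentum


print(zeeman_level_shift(0, 1, 1.0e4))   # spin part alone in a field of 10^4 gauss
orbital_ratio, spin_ratio = moment_to_momentum_ratios()
print(spin_ratio / orbital_ratio)        # the factor two of the magnetic anomaly
```

For a field of $10^4$ gauss the shift for $m_z' = 0$, $\sigma_z' = 1$ comes out near $9.27\times10^{-17}$ erg, and a state with $m_z' = \hbar$, $\sigma_z' = -1$ is not shifted at all.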

Since the energy-levels now involve $m_z$, the selection rule for $m_z$
obtained in the preceding section becomes capable of direct comparison
with experiment. We take a Heisenberg representation in
which, among other constants of the motion, $m_z$ and $\sigma_z$ are diagonal.
The selection rule for $m_z$ now requires $m_z$ to change by $\hbar$, $0$, or $-\hbar$,
while $\sigma_z$, since it commutes with the electric displacement, will not
change at all. Thus the energy difference between the two states
taking part in the transition process will differ by an amount
$e\hbar\mathcal{H}/2mc$, $0$, or $-e\hbar\mathcal{H}/2mc$ from its value for no magnetic field.
Hence, from Bohr's frequency condition, the frequency of the
associated electromagnetic radiation will differ by $e\mathcal{H}/4\pi mc$, $0$, or
$-e\mathcal{H}/4\pi mc$ from that for no magnetic field. This means that each
spectral line for no magnetic field gets split up by the field into three
components. If one considers radiation moving in the $z$-direction,
then from (83) the two outer components will be circularly polarized,
while the central undisplaced one will be of zero intensity. These
results are in agreement with experiment and also with the classical
theory of the Zeeman effect.


42. General remarks

In the preceding chapter exact treatments were given of some simple
dynamical systems in the quantum theory. Most quantum problems,
however, cannot be solved exactly with the present resources of
mathematics, as they lead to equations whose solutions cannot be
expressed in finite terms with the help of the ordinary functions of
analysis. For such problems one can often use a perturbation method.
This consists in splitting up the Hamiltonian into two parts, one of
which must be simple and the other small. The first part may then
be considered as the Hamiltonian of a simplified or unperturbed
system, which can be dealt with exactly, and the addition of the
second will then require small corrections, of the nature of a
perturbation, in the solution for the unperturbed system. The requirement
that the first part shall be simple requires in practice that it shall not
involve the time explicitly. If the second part contains a small
numerical factor $\epsilon$, we can obtain the solution of our equations for
the perturbed system in the form of a power series in $\epsilon$, which,
provided it converges, will give the answer to our problem with any
desired accuracy. Even when the series does not converge, the first
approximation obtained by means of it is usually fairly accurate.

There are two distinct methods in perturbation theory. In one of
these the perturbation is considered as causing a modification of the
states of motion of the unperturbed system. In the other we do not
consider any modification to be made in the states of the unperturbed
system, but we suppose that the perturbed system, instead of remaining
permanently in one of these states, is continually changing from
one to another, or making transitions, under the influence of the
perturbation. Which method is to be used in any particular case
depends on the nature of the problem to be solved. The first method
is useful usually only when the perturbing energy (the correction in the
Hamiltonian for the undisturbed system) does not involve the time
explicitly, and is then applied to the stationary states. It can be used
for calculating things that do not refer to any definite time, such as
the energy-levels of the stationary states of the perturbed system, or,
in the case of collision problems, the probability of scattering through
a given angle. The second method must, on the other hand, be used
for solving all problems involving a consideration of time, such as
those about the transient phenomena that occur when the perturbation
is suddenly applied, or more generally problems in which the
perturbation varies with the time in any way (i.e. in which the
perturbing energy involves the time explicitly). Again, this second
method must be used in collision problems, even though the
perturbing energy does not here involve the time explicitly, if one
wishes to calculate absorption and emission probabilities, since these
probabilities, unlike a scattering probability, cannot be defined without
reference to a state of affairs that varies with the time.

One can summarize the distinctive features of the two methods by
saying that, with the first method, one compares the stationary states
of the perturbed system with those of the unperturbed system; with
the second method one takes a stationary state of the unperturbed
system and sees how it varies with time under the influence of the
perturbation.

43. The change in the energy-levels caused by a perturbation 

The first of the above-mentioned methods will now be applied to
the calculation of the changes in the energy-levels of a system caused
by a perturbation. We assume the perturbing energy, like the
Hamiltonian for the unperturbed system, not to involve the time explicitly.
Our problem has a meaning, of course, only provided the energy-levels
of the unperturbed system are discrete and the differences between
them are large compared with the changes in them caused by the
perturbation. This circumstance results in the treatment of perturbation
problems by the first method having some different features
according to whether the energy-levels of the unperturbed system are
discrete or continuous.

Let the Hamiltonian of the perturbed system be

$$H = E + V, \qquad (1)$$

$E$ being the Hamiltonian of the unperturbed system and $V$ the small
perturbing energy. By hypothesis each eigenvalue $H'$ of $H$ lies very
close to one and only one eigenvalue $E'$ of $E$. We shall use the same
number of primes to specify any eigenvalue of $H$ and the eigenvalue
of $E$ to which it lies very close. Thus we shall have $H''$ differing from
$E''$ by a small quantity of order $V$ and differing from $E'$ by a quantity
that is not small unless $E' = E''$. We must now take care always to
use different numbers of primes to specify eigenvalues of $H$ and $E$
which we do not want to lie very close together.

To obtain the eigenvalues of $H$, we have to solve the equation

$$H|H'\rangle = H'|H'\rangle$$

or

$$(H' - E)|H'\rangle = V|H'\rangle. \qquad (2)$$

Let $|0\rangle$ be an eigenket of $E$ belonging to the eigenvalue $E'$ and
suppose the $|H'\rangle$ and $H'$ that satisfy (2) to differ from $|0\rangle$ and $E'$
only by small quantities and to be expressed as

$$|H'\rangle = |0\rangle + |1\rangle + |2\rangle + \dots,$$
$$H' = E' + a_1 + a_2 + \dots, \qquad (3)$$

where $|1\rangle$ and $a_1$ are of the first order of smallness (i.e. the same order
as $V$), $|2\rangle$ and $a_2$ are of the second order, and so on. Substituting
these expressions in (2), we obtain

$$\{E' - E + a_1 + a_2 + \dots\}\{|0\rangle + |1\rangle + |2\rangle + \dots\} = V\{|0\rangle + |1\rangle + \dots\}.$$

If we now separate the terms of zero order, of the first order, of the
second order, and so on, we get the following set of equations,

$$(E' - E)|0\rangle = 0,$$
$$(E' - E)|1\rangle + a_1|0\rangle = V|0\rangle, \qquad (4)$$
$$(E' - E)|2\rangle + a_1|1\rangle + a_2|0\rangle = V|1\rangle,$$
$$\dots$$

The first of these equations tells us, what we have already assumed,
that $|0\rangle$ is an eigenket of $E$ belonging to the eigenvalue $E'$. The others
enable us to calculate the various corrections $|1\rangle$, $|2\rangle$, ..., $a_1$, $a_2$, ....

For the further discussion of these equations it is convenient to
introduce a representation in which $E$ is diagonal, i.e. a Heisenberg
representation for the unperturbed system, and to take $E$ itself as
one of the observables whose eigenvalues label the representatives.
Let the others, in the event of others being necessary, as is the case
when there is more than one eigenstate of $E$ belonging to any
eigenvalue, be called $\beta$'s. A basic bra is then $\langle E''\beta''|$. Since $|0\rangle$ is an
eigenket of $E$ belonging to the eigenvalue $E'$, we have

$$\langle E''\beta''|0\rangle = \delta_{E''E'}\, f(\beta''), \qquad (5)$$

where $f(\beta'')$ is some function of the variables $\beta''$. With the help of this
result the second of equations (4), written in terms of representatives,
becomes

$$(E' - E'')\langle E''\beta''|1\rangle + a_1\,\delta_{E''E'}\, f(\beta'') = \sum_{\beta'} \langle E''\beta''|V|E'\beta'\rangle f(\beta'). \qquad (6)$$

Putting $E'' = E'$ here, we get

$$a_1 f(\beta'') = \sum_{\beta'} \langle E'\beta''|V|E'\beta'\rangle f(\beta'). \qquad (7)$$

Equation (7) is of the form of the standard equation in the theory
of eigenvalues, so far as the variables $\beta'$ are concerned. It shows that
the various possible values for $a_1$ are the eigenvalues of the matrix
$\langle E'\beta''|V|E'\beta'\rangle$. This matrix is a part of the representative of the
perturbing energy in the Heisenberg representation for the
unperturbed system, namely, the part consisting of those elements that
refer to the same unperturbed energy-level $E'$ for their row and
column. Each of these values for $a_1$ gives, to the first order, an
energy-level of the perturbed system lying close to the energy-level $E'$ of the
unperturbed system.† There may thus be several energy-levels of the
perturbed system lying close to the one energy-level $E'$ of the
unperturbed system, their number being anything not exceeding the
number of independent states of the unperturbed system belonging
to the energy-level $E'$. In this way the perturbation may cause a
separation or partial separation of the energy-levels that coincide
at $E'$ for the unperturbed system.
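
For the simplest degenerate case, two independent states belonging to the level $E'$, equation (7) is a $2\times2$ eigenvalue problem that can be solved in closed form. The sketch below assumes a real symmetric block $\langle E'\beta''|V|E'\beta'\rangle$ and uses hypothetical names and sample numbers; it is an illustration, not a general eigenvalue routine.

```python
import math


def first_order_shifts(block):
    """Eigenvalues a_1 of a real symmetric 2x2 block of the perturbing energy."""
    (v11, v12), (v21, v22) = block
    if abs(v12 - v21) > 1e-12:
        raise ValueError("block must be symmetric")
    mean = 0.5 * (v11 + v22)
    spread = math.hypot(0.5 * (v11 - v22), v12)
    return (mean - spread, mean + spread)


# Off-diagonal coupling separates the two coincident levels ...
print(first_order_shifts([[0.0, 0.2], [0.2, 0.0]]))
# ... while a block proportional to the unit matrix shifts them together.
print(first_order_shifts([[0.3, 0.0], [0.0, 0.3]]))
```

The first block gives two distinct values of $a_1$ (a complete separation), the second gives a single repeated value (no separation).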

Equation (7) also determines, to the zero order, the representatives
$\langle E''\beta''|0\rangle$ of the stationary states of the perturbed system belonging
to energy-levels lying close to $E'$, any solution of (7) substituted
in (5) giving one such representative. Each of these stationary states
of the perturbed system approximates to one of the stationary states
of the unperturbed system, but the converse, that each stationary
state of the unperturbed system approximates to one of the stationary
states of the perturbed system, is not true, since the general
stationary state of the unperturbed system belonging to the
energy-level $E'$ is represented by the right-hand side of (5) with an arbitrary
function $f(\beta'')$. The problem of finding which stationary states of
the unperturbed system approximate to stationary states of the
perturbed system, i.e. the problem of finding the solutions $f(\beta')$ of
(7), corresponds to the problem of 'secular perturbations' in classical
mechanics. It should be noted that the above results are independent
of the values of all those matrix elements of the perturbing
energy which refer to two different energy-levels of the unperturbed
system.

† To distinguish these energy-levels one from another we should require some
more elaborate notation, since according to the present notation they must all be
specified by the same number of primes, namely by the number of primes specifying
the energy-level of the unperturbed system from which they arise. For our present
purposes, however, this more elaborate notation is not required.

Let us see what the above results become in the specially simple case
when there is only one stationary state of the unperturbed system
belonging to each energy-level.† In this case $E$ alone fixes the
representation, no $\beta$'s being required. The sum in (7) now reduces to a
single term and we get

$$a_1 = \langle E'|V|E'\rangle. \qquad (8)$$

There is only one energy-level of the perturbed system lying close to
any energy-level of the unperturbed system and the change in energy
is equal, in the first order, to the corresponding diagonal element of the
perturbing energy in the Heisenberg representation for the unperturbed
system, or to the average value of the perturbing energy for the
corresponding unperturbed state. The latter formulation of the result is the same
as in classical mechanics when the unperturbed system is multiply
periodic.

† A system with only one stationary state belonging to each energy-level is often
called non-degenerate and one with two or more stationary states belonging to an
energy-level is called degenerate, although these words are not very appropriate from
the modern point of view.

We shall proceed to calculate the second-order correction $a_2$ in
the energy-level for the case when the unperturbed system is
non-degenerate. Equation (5) for this case reads

$$\langle E''|0\rangle = \delta_{E''E'},$$

with neglect of an unimportant numerical factor, and equation (6)
reads

$$(E' - E'')\langle E''|1\rangle + a_1\,\delta_{E''E'} = \langle E''|V|E'\rangle.$$

This gives us the value of $\langle E''|1\rangle$ when $E'' \neq E'$, namely

$$\langle E''|1\rangle = \frac{\langle E''|V|E'\rangle}{E' - E''}. \qquad (9)$$

The third of equations (4), written in terms of representatives,
becomes

$$(E' - E'')\langle E''|2\rangle + a_1\langle E''|1\rangle + a_2\,\delta_{E''E'} = \sum_{E'''} \langle E''|V|E'''\rangle\langle E'''|1\rangle.$$

Putting $E'' = E'$ here, we get

$$a_1\langle E'|1\rangle + a_2 = \sum_{E'''} \langle E'|V|E'''\rangle\langle E'''|1\rangle,$$

which reduces, with the help of (8), to

$$a_2 = \sum_{E'' \neq E'} \langle E'|V|E''\rangle\langle E''|1\rangle.$$

Substituting for $\langle E''|1\rangle$ from (9), we obtain finally

$$a_2 = \sum_{E'' \neq E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''},$$

giving for the total energy change to the second order

$$a_1 + a_2 = \langle E'|V|E'\rangle + \sum_{E'' \neq E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''}. \qquad (10)$$

The method may be developed for the calculation of the higher
approximations if required. General recurrence formulas giving the
$n$th order corrections in terms of those of lower order have been
obtained by Born, Heisenberg, and Jordan.†
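
Formula (10) can be compared with an exact result in the one case where the exact eigenvalues are elementary, a system with two non-degenerate levels. The following sketch assumes a real symmetric perturbing matrix and uses hypothetical function names and sample numbers.

```python
import math


def energy_to_second_order(level, energies, v):
    """E' + a_1 + a_2 from (8) and (10) for a non-degenerate spectrum."""
    e_level = energies[level]
    a1 = v[level][level]
    a2 = 0.0
    for other, e_other in enumerate(energies):
        if other == level:
            continue  # the sum in (10) omits the level itself
        a2 += v[level][other] * v[other][level] / (e_level - e_other)
    return e_level + a1 + a2


def exact_two_level(energies, v):
    """Exact eigenvalues (lower, upper) of E + V for a real symmetric 2x2 case."""
    h11 = energies[0] + v[0][0]
    h22 = energies[1] + v[1][1]
    mean = 0.5 * (h11 + h22)
    spread = math.hypot(0.5 * (h11 - h22), v[0][1])
    return (mean - spread, mean + spread)


levels = [0.0, 1.0]
coupling = [[0.0, 0.01], [0.01, 0.0]]
print(energy_to_second_order(0, levels, coupling), exact_two_level(levels, coupling)[0])
print(energy_to_second_order(1, levels, coupling), exact_two_level(levels, coupling)[1])
```

With a coupling of $0.01$ and a level spacing of unity, the second-order values differ from the exact ones only in the fourth order of the coupling, about $10^{-8}$.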


44. The perturbation considered as causing transitions 

We shall now consider the second of the two perturbation methods
mentioned in § 42. We suppose again that we have an unperturbed
system governed by a Hamiltonian $E$ which does not involve the
time explicitly, and a perturbing energy $V$ which can now be an
arbitrary function of the time. The Hamiltonian for the perturbed
system is again $H = E + V$. For the present method it does not
make any essential difference whether the energy-levels of the
unperturbed system, i.e. the eigenvalues of $E$, form a discrete or
continuous set. We shall, however, take the discrete case, for
definiteness. We shall again work with a Heisenberg representation
for the unperturbed system, but as there will now be no advantage in
taking $E$ itself as one of the observables whose eigenvalues label the
representatives, we shall suppose we have a general set of $\alpha$'s to label
the representatives.

Let us suppose that at the initial time $t_0$ the system is in a state for
which the $\alpha$'s certainly have the values $\alpha'$. The ket corresponding to
this state is the basic ket $|\alpha'\rangle$. If there were no perturbation, i.e. if the
Hamiltonian were $E$, this state would be stationary. The perturbation
causes the state to change. At time $t$ the ket corresponding to the
state in Schrödinger's picture will be $T|\alpha'\rangle$, according to equation (1)
of § 27. The probability of the $\alpha$'s then having the values $\alpha''$ is

$$P(\alpha'\alpha'') = |\langle\alpha''|T|\alpha'\rangle|^2. \qquad (11)$$

For $\alpha'' \neq \alpha'$, $P(\alpha'\alpha'')$ is the probability of a transition taking place
from state $\alpha'$ to state $\alpha''$ during the time interval $t_0 \to t$, while $P(\alpha'\alpha')$
is the probability of no transition taking place at all. The sum of
$P(\alpha'\alpha'')$ for all $\alpha''$ is, of course, unity.

† Z. f. Physik, 35 (1925), 565.

Let us now suppose that initially the system, instead of being
certainly in the state $\alpha'$, is in one or other of various states $\alpha'$ with
the probability $P_{\alpha'}$ for each. The Gibbs density corresponding to this
distribution is, according to (68) of § 33

$$\rho = \sum_{\alpha'} |\alpha'\rangle P_{\alpha'} \langle\alpha'|. \qquad (12)$$

At time $t$, each ket $|\alpha'\rangle$ will have changed to $T|\alpha'\rangle$ and each bra $\langle\alpha'|$
to $\langle\alpha'|\bar T$, so $\rho$ will have changed to

$$\rho_t = \sum_{\alpha'} T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T. \qquad (13)$$

The probability of the $\alpha$'s then having the values $\alpha''$ will be, from
(73) of § 33,

$$\langle\alpha''|\rho_t|\alpha''\rangle = \sum_{\alpha'} \langle\alpha''|T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T|\alpha''\rangle$$
$$= \sum_{\alpha'} P_{\alpha'} P(\alpha'\alpha'') \qquad (14)$$

with the help of (11). This result expresses that the probability of
the system being in the state $\alpha''$ at time $t$ is the sum of the probabilities
of the system being initially in any state $\alpha' \neq \alpha''$, and making a
transition from state $\alpha'$ to state $\alpha''$, and the probability of its being initially
in the state $\alpha''$ and making no transition. Thus the various transition
probabilities act independently of one another, according to the
ordinary laws of probability.
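
Formula (14) amounts to a weighted sum over the initial states, as the following small sketch shows. The $2\times2$ rotation used for $T$ is a hypothetical example chosen only because it is unitary, and the function names and initial probabilities are likewise illustrative.

```python
import math


def transition_probabilities(t_matrix):
    """P[a1][a2] = |<a2|T|a1>|^2, formula (11), with T stored as T[a2][a1]."""
    n = len(t_matrix)
    return [[abs(t_matrix[a2][a1]) ** 2 for a2 in range(n)] for a1 in range(n)]


def final_probabilities(initial, t_matrix):
    """Formula (14): sum over initial states a1 of P_a1 * P(a1 a2)."""
    p = transition_probabilities(t_matrix)
    n = len(initial)
    return [sum(initial[a1] * p[a1][a2] for a1 in range(n)) for a2 in range(n)]


THETA = 0.3
T_EXAMPLE = [[math.cos(THETA), -math.sin(THETA)],
             [math.sin(THETA), math.cos(THETA)]]

final = final_probabilities([0.25, 0.75], T_EXAMPLE)
print(final, sum(final))
```

Because $T$ is unitary the final probabilities again add up to unity, whatever the initial distribution.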

The whole problem of calculating transitions thus reduces to the
determination of the probability amplitudes $\langle\alpha''|T|\alpha'\rangle$. These can be
worked out from the differential equation for $T$, equation (6) of § 27, or

$$i\hbar\, dT/dt = HT = (E + V)T. \qquad (15)$$

The calculation can be simplified by working with

$$T^* = e^{iE(t-t_0)/\hbar}\, T. \qquad (16)$$

We have

$$i\hbar\, dT^*/dt = e^{iE(t-t_0)/\hbar}(-ET + i\hbar\, dT/dt)$$
$$= e^{iE(t-t_0)/\hbar}\, VT = V^* T^*, \qquad (17)$$

where

$$V^* = e^{iE(t-t_0)/\hbar}\, V\, e^{-iE(t-t_0)/\hbar}, \qquad (18)$$

i.e. $V^*$ is the result of applying a certain unitary transformation to $V$.
Equation (17) is of a more convenient form than (15), because (17)
makes the change in $T^*$ depend entirely on the perturbation $V$, and
for $V = 0$ it would make $T^*$ equal its initial value, namely unity.
We have from (16)

$$\langle\alpha''|T^*|\alpha'\rangle = e^{iE''(t-t_0)/\hbar}\langle\alpha''|T|\alpha'\rangle,$$

so that

$$P(\alpha'\alpha'') = |\langle\alpha''|T^*|\alpha'\rangle|^2, \qquad (19)$$

showing that $T^*$ and $T$ are equally good for determining transition
probabilities.

Our work up to the present has been exact. We now assume $V$ is
a small quantity of the first order and express $T^*$ in the form

$$T^* = 1 + T_1^* + T_2^* + \dots, \qquad (20)$$

where $T_1^*$ is of the first order, $T_2^*$ is of the second, and so on.
Substituting (20) into (17) and equating terms of equal order, we get

$$i\hbar\, dT_1^*/dt = V^*,$$
$$i\hbar\, dT_2^*/dt = V^* T_1^*, \qquad (21)$$
$$\dots$$

From the first of these equations we obtain

$$T_1^* = -i\hbar^{-1}\int_{t_0}^{t} V^*(t')\, dt', \qquad (22)$$

from the second we obtain

$$T_2^* = -\hbar^{-2}\int_{t_0}^{t} V^*(t')\, dt' \int_{t_0}^{t'} V^*(t'')\, dt'', \qquad (23)$$

and so on. For many practical problems it is sufficiently accurate to
retain only the term $T_1^*$, which gives for the transition probability
$P(\alpha'\alpha'')$ with $\alpha'' \neq \alpha'$

$$P(\alpha'\alpha'') = \hbar^{-2}\left|\langle\alpha''|\int_{t_0}^{t} V^*(t')\, dt'\,|\alpha'\rangle\right|^2 = \hbar^{-2}\left|\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt'\right|^2. \qquad (24)$$

We obtain in this way the transition probability to the second order
of accuracy. The result depends only on the matrix element
$\langle\alpha''|V^*(t')|\alpha'\rangle$ of $V^*(t')$ referring to the two states concerned, with $t'$
going from $t_0$ to $t$. Since $V^*$ is real, like $V$,

$$\langle\alpha''|V^*(t')|\alpha'\rangle = \overline{\langle\alpha'|V^*(t')|\alpha''\rangle}$$

and hence

$$P(\alpha'\alpha'') = P(\alpha''\alpha') \qquad (25)$$

to the second order of accuracy.

Sometimes one is interested in a transition $\alpha' \to \alpha''$ such that the
matrix element $\langle\alpha''|V^*|\alpha'\rangle$ vanishes, or is small compared with other
matrix elements of $V^*$. It is then necessary to work to a higher
accuracy. If we retain only the terms $T_1^*$ and $T_2^*$, we get, for $\alpha'' \neq \alpha'$,

$$P(\alpha'\alpha'') = \hbar^{-2}\left|\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' - i\hbar^{-1}\sum_{\alpha''' \neq \alpha', \alpha''} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt''\right|^2. \qquad (26)$$

The terms $\alpha''' = \alpha'$ and $\alpha''' = \alpha''$ are omitted from the sum since they
are small compared with other terms of the sum, on account of the
smallness of $\langle\alpha''|V^*|\alpha'\rangle$. To interpret the result (26), we may suppose
that the term

$$\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' \qquad (27)$$

gives rise to a transition directly from state $\alpha'$ to state $\alpha''$, while the
term

$$-i\hbar^{-1}\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt'' \qquad (28)$$

gives rise to a transition from state $\alpha'$ to state $\alpha'''$, followed by a
transition from state $\alpha'''$ to state $\alpha''$. The state $\alpha'''$ is called an
intermediate state in this interpretation. We must add the term (27) to the
various terms (28) corresponding to different intermediate states
and then take the square of the modulus of the sum, which means
that there is interference between the different transition processes,
the direct one and those involving intermediate states, and one
cannot give a meaning to the probability for one of these processes by
itself. For each of these processes, however, there is a probability
amplitude. If one carries out the perturbation method to a higher
degree of accuracy, one obtains a result which can be interpreted
similarly, with the help of more complicated transition processes
involving a succession of intermediate states.
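
The first-order formula (24) lends itself to a direct numerical check. The sketch below assumes a perturbation that is constant in time and switched on at $t_0 = 0$, so that from (18) the matrix element of $V^*$ is the constant element of $V$ multiplied by $e^{i\omega t'}$ with $\omega = (E''-E')/\hbar$. Units with $\hbar = 1$, the midpoint integration rule and the function names are illustrative choices.

```python
import cmath
import math


def first_order_probability(v_element, omega, t, steps=2000, hbar=1.0):
    """Formula (24) evaluated by the midpoint rule, with t0 = 0.

    v_element is the constant element of V between the two states and
    omega = (E'' - E') / hbar, so the integrand is v_element * exp(i omega t').
    """
    h = t / steps
    integral = 0j
    for k in range(steps):
        t_mid = (k + 0.5) * h
        integral += v_element * cmath.exp(1j * omega * t_mid) * h
    return abs(integral) ** 2 / hbar ** 2


def closed_form(v_element, omega, t, hbar=1.0):
    """The same integral done analytically: |v|^2 (2 sin(omega t / 2) / omega)^2."""
    return (abs(v_element) ** 2
            * (2.0 * math.sin(0.5 * omega * t) / omega) ** 2 / hbar ** 2)


print(first_order_probability(0.01, 1.0, 2.0), closed_form(0.01, 1.0, 2.0))
# Reversing the transition changes the sign of omega; the probability is unchanged.
print(first_order_probability(0.01, -1.0, 2.0))
```

The last line illustrates the symmetry (25): the reverse transition has the same first-order probability.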

45. Application to radiation 

In the preceding section a general theory of the perturbation of an
atomic system was developed, in which the perturbing energy could
vary with the time in an arbitrary way. A perturbation of this
kind can be realized in practice by allowing incident electromagnetic
radiation to fall on the system. Let us see what our result (24) reduces
to in this case.

If we neglect the effects of the magnetic field of the incident radiation,
and if we further assume that the wave-lengths of the harmonic
components of this radiation are all large compared with the dimensions
of the atomic system, then the perturbing energy is simply the
scalar product

$$V = (\mathbf{D}, \boldsymbol{\mathcal{E}}), \tag{29}$$

where $\mathbf{D}$ is the total electric displacement of the system and $\boldsymbol{\mathcal{E}}$ is
the electric force of the incident radiation. We suppose $\boldsymbol{\mathcal{E}}$ to be a
given function of the time. If we take for simplicity the case when
the incident radiation is plane polarized with its electric vector in
a certain direction and let $D$ denote the Cartesian component of $\mathbf{D}$
in this direction, the expression (29) for $V$ reduces to the ordinary
product

$$V = D\mathcal{E},$$

where $\mathcal{E}$ is the magnitude of the vector $\boldsymbol{\mathcal{E}}$. The matrix elements of
$V$ are

$$\langle\alpha''|V|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,\mathcal{E},$$

since $\mathcal{E}$ is a number. The matrix element $\langle\alpha''|D|\alpha'\rangle$ is independent
of $t$. From (18)

$$\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,e^{i(E''-E')t/\hbar}\,\mathcal{E}(t),$$

and hence the expression (24) for the transition probability becomes

$$P(\alpha'\alpha'') = \hbar^{-2}\,|\langle\alpha''|D|\alpha'\rangle|^2\,\Bigl|\int_{t_0}^{t} e^{i(E''-E')t'/\hbar}\,\mathcal{E}(t')\,dt'\Bigr|^2. \tag{30}$$


If the incident radiation during the time interval $t_0$ to $t$ is resolved
into its Fourier components, the energy crossing unit area per unit
frequency range about the frequency $\nu$ will be, according to classical
electrodynamics,

$$E_\nu = \frac{c}{2\pi}\,\Bigl|\int_{t_0}^{t} e^{2\pi i\nu t'}\,\mathcal{E}(t')\,dt'\Bigr|^2. \tag{31}$$

Comparing this with (30), we obtain

$$P(\alpha'\alpha'') = 2\pi c^{-1}\hbar^{-2}\,|\langle\alpha''|D|\alpha'\rangle|^2\,E_\nu, \tag{32}$$

where

$$\nu = |E''-E'|/h. \tag{33}$$

From this result we see in the first place that the transition probability
depends only on that Fourier component of the incident radiation
whose frequency $\nu$ is connected with the change of energy by (33).
This gives us Bohr's Frequency Condition and shows how the ideas
of Bohr’s atomic theory, which was the forerunner of quantum 
mechanics, can be fitted in with quantum mechanics. 
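
As a rough numerical illustration of this frequency selection, the following sketch evaluates the squared time integral appearing in (30) for a monochromatic field and scans the atomic angular frequency $(E''-E')/\hbar$. The cosine field, the units, all numerical values and the helper name `fourier_factor` are assumptions made for the illustration only and do not come from the text.

```python
import cmath
import math

def fourier_factor(omega, field, t_end, steps=4000):
    """Midpoint-rule estimate of |integral_0^t_end exp(i*omega*t') field(t') dt'|^2.

    omega plays the part of (E'' - E')/hbar in formula (30).
    """
    dt = t_end / steps
    total = 0j
    for n in range(steps):
        t = (n + 0.5) * dt
        total += cmath.exp(1j * omega * t) * field(t)
    return abs(total * dt) ** 2

omega_field = 5.0   # angular frequency of the incident wave (assumed value)
t_end = 40.0        # duration of the illumination (assumed value)

def field(t):
    """Hypothetical plane-polarized field of unit amplitude."""
    return math.cos(omega_field * t)

# Scan the atomic frequency from 3.0 to 7.0 and locate the maximum.
grid = [3.0 + 0.02 * n for n in range(201)]
values = [fourier_factor(w, field, t_end) for w in grid]
omega_peak = grid[values.index(max(values))]
```

Under these assumptions the factor is sharply peaked where the atomic frequency equals that of the field, and the peak height is close to $(t/2)^2$, so that only the resonant Fourier component contributes appreciably.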

The present elementary theory does not tell us anything about the 
energy of the field of radiation. It would be reasonable to assume, 
though, that the energy absorbed or liberated by the atomic system 
in the transition process comes from or goes into the component of 
the radiation with frequency v given by (33). This assumption will 
be justified by the more complete theory of radiation given in 
Chapter X. The result (32) is then to be interpreted as the proba¬ 
bility of the system, if initially in the state of lower energy, absorb¬ 
ing radiation and being carried to the upper state, and if initially in 
the upper state, being stimulated by the incident radiation to emit 
and fall to the lower state. The present theory does not account for 
the experimental fact that the system, if in the upper state with no 
incident radiation, can emit spontaneously and fall to the lower state, 
but this also will be accounted for by the more complete theory of 
Chapter X. 

The existence of the phenomenon of stimulated emission was inferred
by Einstein,† long before the discovery of quantum mechanics,
from a consideration of statistical equilibrium between atoms and a 
field of black-body radiation satisfying Planck’s law. Einstein showed 
that the transition probability for stimulated emission must equal 
that for absorption between the same pair of states, in agreement 
with the present quantum theory, and deduced also a relation con¬ 
necting this transition probability with that for spontaneous emission, 
which relation is in agreement with the theory of Chapter X. 

† Einstein, Phys. Zeits. 18 (1917), 121.

The matrix element $\langle\alpha''|D|\alpha'\rangle$ in (32) plays the part of the amplitude
of one of the Fourier components of $D$ in the classical theory of
a multiply-periodic system interacting with radiation. In fact it was
the idea of replacing classical Fourier components by matrix elements
which led Heisenberg to the discovery of quantum mechanics in 1925.
Heisenberg assumed that the formulas describing the interaction with
radiation of a system in the quantum theory can be obtained from
the classical formulas by substituting for the Fourier components of
the total electric displacement of the system the corresponding matrix
elements. According to this assumption applied to spontaneous emission,
a system having an electric moment $\mathbf{D}$ will, when in the state
$\alpha'$, spontaneously emit radiation of frequency $\nu = (E'-E'')/h$, where
$E''$ is an energy-level, less than $E'$, of some state $\alpha''$, at the rate

$$\frac{4}{3}\,\frac{(2\pi\nu)^4}{c^3}\,|\langle\alpha''|\mathbf{D}|\alpha'\rangle|^2. \tag{34}$$

The distribution of this radiation over the different directions of
emission and its state of polarization for each direction will be the
same as that for a classical electric dipole of moment equal to the
real part of $\langle\alpha''|\mathbf{D}|\alpha'\rangle$. To interpret this rate of emission of radiant
energy as a transition probability, we must divide it by the quantum
of energy of this frequency, namely $h\nu$, and call it the probability per
unit time of this quantum being spontaneously emitted, with the
atomic system simultaneously dropping to the state $\alpha''$ of lower
energy. These assumptions of Heisenberg are justified by the present
radiation theory, supplemented by the spontaneous transition theory
of Chapter X.

46. Transitions caused by a perturbation independent of the time

The perturbation method of § 44 is still valid when the perturbing 
energy V does not involve the time t explicitly. Since the total 
Hamiltonian H in this case does not involve t explicitly, we could 
now, if desired, deal with the system by the perturbation method of 
§ 43 and find its stationary states. Whether this method would be 
convenient or not would depend on what we want to find out about 
the system. If what we have to calculate makes an explicit reference 
to the time, e.g. if we have to calculate the probability of the system 
being in a certain state at one time when we are given that it is in a 
certain state at another time, the method of § 44 would be the more 
convenient one. 

Let us see what the result (24) for the transition probability becomes
when $V$ does not involve $t$ explicitly and let us take $t_0 = 0$ to simplify
the writing. The matrix element $\langle\alpha''|V|\alpha'\rangle$ is now independent of $t$,
and from (18)

$$\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|V|\alpha'\rangle\,e^{i(E''-E')t/\hbar}, \tag{35}$$

so

$$\int_0^t \langle\alpha''|V^*(t')|\alpha'\rangle\,dt' = \langle\alpha''|V|\alpha'\rangle\,\frac{e^{i(E''-E')t/\hbar}-1}{i(E''-E')/\hbar},$$

provided $E'' \neq E'$. Thus the transition probability (24) becomes

$$P(\alpha'\alpha'') = |\langle\alpha''|V|\alpha'\rangle|^2\,[e^{i(E''-E')t/\hbar}-1][e^{-i(E''-E')t/\hbar}-1]/(E''-E')^2$$
$$= 2\,|\langle\alpha''|V|\alpha'\rangle|^2\,[1-\cos\{(E''-E')t/\hbar\}]/(E''-E')^2. \tag{36}$$

If $E''$ differs appreciably from $E'$ this transition probability is small
and remains so for all values of $t$. This result is required by the law
of the conservation of energy. The total energy $H$ is constant and
hence the proper-energy $E$ (i.e. the energy with neglect of the part
$V$ due to the perturbation), being approximately equal to $H$, must
be approximately constant. This means that if $E$ initially has the
numerical value $E'$, at any later time there must be only a small
probability of its having a numerical value differing considerably
from $E'$.

On the other hand, when the initial state $\alpha'$ is such that there exists
another state $\alpha''$ having the same or very nearly the same proper-energy
$E$, the probability of a transition to the final state $\alpha''$ may be
quite large. The case of physical interest now is that in which there
is a continuous range of final states $\alpha''$ having a continuous range of
proper-energy levels $E''$ passing through the value $E'$ of the proper-energy
of the initial state. The initial state must not be one of the
continuous range of final states, but may be either a separate discrete
state or one of another continuous range of states. We shall now have,
remembering the rules of § 18 for the interpretation of probability
amplitudes with continuous ranges of states, that, with $P(\alpha'\alpha'')$
having the value (36), the probability of a transition to a final state
within the small range $\alpha''$ to $\alpha''+d\alpha''$ will be $P(\alpha'\alpha'')\,d\alpha''$ if the initial
state $\alpha'$ is discrete and will be proportional to this quantity if $\alpha'$ is
one of a continuous range.

We may suppose that the $\alpha$'s describing the final state consist of
$E$ together with a number of other dynamical variables $\beta$, so that we
have a representation like that of § 43 for the degenerate case. (The
$\beta$'s, however, need have no meaning for the initial state $\alpha'$.) We shall
suppose for definiteness that the $\beta$'s have only discrete eigenvalues.
The total probability of a transition to a final state $\alpha''$ for which the
$\beta$'s have the values $\beta''$ and $E$ has any value (there will be a strong
probability of its having a value near the initial value $E'$) will now
be (or be proportional to)

$$\int P(\alpha'\alpha'')\,dE'' = 2\int_{-\infty}^{\infty} |\langle E''\beta''|V|\alpha'\rangle|^2\,[1-\cos\{(E''-E')t/\hbar\}]/(E''-E')^2\,dE'' \tag{37}$$
$$= 2t\hbar^{-1}\int_{-\infty}^{\infty} |\langle E'+\hbar x/t,\ \beta''|V|\alpha'\rangle|^2\,[1-\cos x]/x^2\,dx$$

if one makes the substitution $(E''-E')t/\hbar = x$. For large values of $t$
this reduces to

$$2t\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2\int_{-\infty}^{\infty}[1-\cos x]/x^2\,dx = 2\pi t\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2. \tag{38}$$

Thus the total probability up to time $t$ of a transition to a final state
for which the $\beta$'s have the values $\beta''$ is proportional to $t$. There is
therefore a definite probability coefficient, or probability per unit time,
for the transition process under consideration, having the value

$$2\pi\hbar^{-1}\,|\langle E'\beta''|V|\alpha'\rangle|^2. \tag{39}$$

It is proportional to the square of the modulus of the matrix element,
associated with this transition, of the perturbing energy.
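
The passage from (37) to (38) can be checked numerically. The sketch below assumes units with $\hbar = 1$ and a matrix element that does not vary with $E''$ (an idealization), and integrates (36) over a wide but finite band of final energies. The names `p36` and `total_probability`, the band width and the two times are hypothetical choices made for the illustration.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1

def p36(v_sq, delta_e, t):
    """Transition probability (36) for E'' - E' = delta_e.

    Uses the identity 1 - cos x = 2 sin^2(x/2), which is numerically
    safer near delta_e = 0 than the cosine form.
    """
    if delta_e == 0.0:
        return v_sq * (t / HBAR) ** 2   # limiting value of (36)
    return 4.0 * v_sq * math.sin(0.5 * delta_e * t / HBAR) ** 2 / delta_e ** 2

def total_probability(v_sq, t, half_width=200.0, steps=200_000):
    """Midpoint-rule integral of (36) over delta_e in [-half_width, half_width]."""
    de = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps):
        delta_e = -half_width + (n + 0.5) * de
        total += p36(v_sq, delta_e, t)
    return total * de

# Probability per unit time at two different times, with |<V>|^2 = 1.
rate_10 = total_probability(1.0, 10.0) / 10.0
rate_20 = total_probability(1.0, 20.0) / 20.0
```

Both rates come out close to $2\pi$, in agreement with (39) for a unit matrix element; the small deficit is the part of the integral lying outside the finite band.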

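If the matrix element $\langle E'\beta''|V|\alpha'\rangle$ is small compared with other
matrix elements of $V$, we must work with the more accurate formula
(26). We have from (35)

$$\int_0^t \langle\alpha''|V^*(t')|\alpha'''\rangle\,dt'\int_0^{t'}\langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt''$$
$$= \langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle\int_0^t e^{i(E''-E''')t'/\hbar}\,dt'\int_0^{t'} e^{i(E'''-E')t''/\hbar}\,dt''$$
$$= \frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{i(E'''-E')/\hbar}\int_0^t \bigl\{e^{i(E''-E')t'/\hbar} - e^{i(E''-E''')t'/\hbar}\bigr\}\,dt'.$$

For $E''$ close to $E'$, only the first term in the integrand here gives rise
to a transition probability of physical importance and the second
term may be discarded. Using this result in (26) we get

$$P(\alpha'\alpha'') = 2\,\Bigl|\langle\alpha''|V|\alpha'\rangle - \sum_{\alpha'''\neq\alpha',\alpha''}\frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\Bigr|^2\,\frac{1-\cos\{(E''-E')t/\hbar\}}{(E''-E')^2},$$

which replaces (36). Proceeding as before, we obtain for the transition
probability per unit time to a final state for which the $\beta$'s have
the values $\beta''$ and $E$ has a value close to its initial value $E'$

$$\frac{2\pi}{\hbar}\,\Bigl|\langle E'\beta''|V|\alpha'\rangle - \sum_{\alpha'''\neq\alpha',\alpha''}\frac{\langle E'\beta''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\Bigr|^2. \tag{40}$$

This formula shows how intermediate states, differing from the initial
state and final state, play a role in the determination of a probability
coefficient.

In order that the approximations used in deriving (39) and (40) may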
be valid, the time t must be not too small and not too large. It must 
be large compared with the periods of the atomic system in order that 
the approximate evaluation of the integral (37) leading to the result 
(38) may be valid, while it must not be excessively large or else the 
general formula (24) or (26) will break down. In fact one could make 
the probability (38) greater than unity by taking t large enough. The 
upper limit to t is fixed by the condition that the probability (24) or 
(26), or t times (39) or (40), must be small compared with unity. There 
is no difficulty in t satisfying both these conditions simultaneously 
provided the perturbing energy V is sufficiently small. 

47. The anomalous Zeeman effect 

One of the simplest examples of the perturbation method of § 43 
is the calculation of the first-order change in the energy-levels of an 
atom caused by a uniform magnetic field. The problem of a hydrogen 
atom in a uniform magnetic field has already been dealt with in § 41 
and was so simple that perturbation theory was unnecessary. The 
case of a general atom is not much more complicated when we make 
a few approximations such that we can set up a simple model for the 
atom. 

We first of all consider the atom in the absence of the magnetic
field and look for constants of the motion or quantities that are
approximately constants of the motion. The total angular momentum
of the atom, the vector $\mathbf{j}$ say, is certainly a constant of the
motion. This angular momentum may be regarded as the sum of two
parts, the total orbital angular momentum of all the electrons, $\mathbf{l}$ say,
and the total spin angular momentum, $\mathbf{s}$ say. Thus we have $\mathbf{j} = \mathbf{l}+\mathbf{s}$.
Now the effect of the spin magnetic moments on the motion of the
electrons is small compared with the effect of the Coulomb forces and
may be neglected as a first approximation. With this approximation
the spin angular momentum of each electron is a constant of the
motion, there being no forces tending to change its orientation. Thus
$\mathbf{s}$, and hence also $\mathbf{l}$, will be constants of the motion. The magnitudes,
$l$, $s$, and $j$ say, of $\mathbf{l}$, $\mathbf{s}$, and $\mathbf{j}$ will be given by

$$l+\tfrac{1}{2}\hbar = (l_x^2+l_y^2+l_z^2+\tfrac{1}{4}\hbar^2)^{1/2},$$
$$s+\tfrac{1}{2}\hbar = (s_x^2+s_y^2+s_z^2+\tfrac{1}{4}\hbar^2)^{1/2},$$
$$j+\tfrac{1}{2}\hbar = (j_x^2+j_y^2+j_z^2+\tfrac{1}{4}\hbar^2)^{1/2},$$

corresponding to equation (39) of § 36. They commute with each
other, and from (47) of § 36 we see that with given numerical values
for $l$ and $s$ the possible numerical values for $j$ are

$$l+s,\quad l+s-\hbar,\quad \ldots,\quad |l-s|.$$
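
The list of possible values of $j$ can be written out mechanically. The following sketch works in units of $\hbar$, so that $l$, $s$ and $j$ are integers or half-odd integers; the function name `allowed_j` is a hypothetical helper introduced for the illustration, not something defined in the text.

```python
from fractions import Fraction

def allowed_j(l, s):
    """Return l+s, l+s-1, ..., |l-s| (all in units of hbar) as exact fractions."""
    l, s = Fraction(l), Fraction(s)
    lowest = abs(l - s)
    values = []
    j = l + s
    while j >= lowest:
        values.append(j)
        j -= 1
    return values

# Example: one unit of orbital momentum combined with spin one-half.
doublet = allowed_j(1, Fraction(1, 2))
```

For $l = 1$, $s = \tfrac{1}{2}$ this gives the two values $\tfrac{3}{2}$ and $\tfrac{1}{2}$, and in general the number of values is $2\min(l,s)+1$ when $l$ and $s$ are measured in units of $\hbar$.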

Let us consider a stationary state for which $l$, $s$, and $j$ have definite
numerical values in agreement with the above scheme. The energy
of this state will depend on $l$, but one might think that with neglect
of the spin magnetic moments it would be independent of $s$, and
also of the direction of the vector $\mathbf{s}$ relative to $\mathbf{l}$, and thus of $j$. It will
be found in Chapter IX, however, that the energy depends very much
on the magnitude $s$ of the vector $\mathbf{s}$, although independent of its
direction when one neglects the spin magnetic moments, on account
of certain phenomena arising from the fact that the electrons are
indistinguishable one from another. There are thus different energy-levels
of the system for each different value of $l$ and $s$. This means
that $l$ and $s$ are functions of the energy, according to the general
definition of a function given in § 11, since the $l$ and $s$ of a stationary
state are fixed when the energy of that state is fixed.

We can now take into account the effect of the spin magnetic
moments, treating it as a small perturbation according to the method
of § 43. The energy of the unperturbed system will still be approximately
a constant of the motion and hence $l$ and $s$, being functions
of this energy, will still be approximately constants of the motion.
The directions of the vectors $\mathbf{l}$ and $\mathbf{s}$, however, not being functions of
the unperturbed energy, need not now be approximately constants
of the motion and may undergo large secular variations. Since the
vector $\mathbf{j}$ is constant, the only possible variation of $\mathbf{l}$ and $\mathbf{s}$ is a precession
about the vector $\mathbf{j}$. We thus have an approximate model of
the atom consisting of the two vectors $\mathbf{l}$ and $\mathbf{s}$ of constant lengths
precessing about their sum $\mathbf{j}$, which is a fixed vector. The energy is
determined mainly by the magnitudes of $\mathbf{l}$ and $\mathbf{s}$ and depends only
slightly on their relative directions, specified by $j$. Thus states with
the same $l$ and $s$ and different $j$ will have only slightly different
energy-levels, forming what is called a multiplet term.

Let us now take this atomic model as our unperturbed system and
suppose it to be subjected to a uniform magnetic field of magnitude $\mathcal{H}$
in the direction of the $z$-axis. The extra energy due to this magnetic
field will consist of a term

$$\frac{e\mathcal{H}}{2mc}\,(m_z+\hbar\sigma_z), \tag{41}$$

like the last term in equation (89) of § 41, contributed by each
electron, and will thus be altogether

$$\frac{e\mathcal{H}}{2mc}\sum(m_z+\hbar\sigma_z) = \frac{e\mathcal{H}}{2mc}\,(l_z+2s_z) = \frac{e\mathcal{H}}{2mc}\,(j_z+s_z). \tag{42}$$

This is our perturbing energy $V$. We shall now use the method of
§ 43 to determine the changes in the energy-levels caused by this $V$.
The method will be legitimate only provided the field is so weak that
$V$ is small compared with the energy differences within a multiplet.

Our unperturbed system is degenerate, on account of the direction
of the vector $\mathbf{j}$ being undetermined. We must therefore take, from
the representative of $V$ in a Heisenberg representation for the unperturbed
system, those matrix elements that refer to one particular
energy-level for their row and column, and obtain the eigenvalues of
the matrix thus formed. We can do this best by first splitting up $V$
into two parts, one of which is a constant of the unperturbed motion,
so that its representative contains only matrix elements referring to
the same unperturbed energy-level for their row and column, while
the representative of the other contains only matrix elements referring
to two different unperturbed energy-levels for their row and
column, so that this second part does not affect the first-order perturbation.
The term involving $j_z$ in (42) is a constant of the unperturbed
motion and thus belongs entirely to the first part. For the
term involving $s_z$ we have

$$s_z(j_x^2+j_y^2+j_z^2) = j_z(s_xj_x+s_yj_y+s_zj_z) + (s_zj_x-j_zs_x)j_x + (s_zj_y-j_zs_y)j_y$$

or

$$s_z = j_z\,\frac{j(j+\hbar)-l(l+\hbar)+s(s+\hbar)}{2j(j+\hbar)} - [\gamma_yj_x-\gamma_xj_y]\,\frac{1}{j(j+\hbar)}, \tag{43}$$

where

$$\gamma_x = s_zj_y-j_zs_y = s_zl_y-l_zs_y = l_ys_z-l_zs_y,$$
$$\gamma_y = j_zs_x-s_zj_x = l_zs_x-s_zl_x = l_zs_x-l_xs_z. \tag{44}$$

The first term in this expression for $s_z$ is a constant of the unperturbed
motion and thus belongs entirely to the first part, while the second
term, as we shall now see, belongs entirely to the second part.
Corresponding to (44) we can introduce

$$\gamma_z = l_xs_y-l_ys_x.$$

It can now easily be verified that

$$j_x\gamma_x+j_y\gamma_y+j_z\gamma_z = 0$$

and from (30) of § 35

$$[j_z,\gamma_x] = \gamma_y,\qquad [j_z,\gamma_y] = -\gamma_x,\qquad [j_z,\gamma_z] = 0.$$

These relations connecting $j_x$, $j_y$, $j_z$ and $\gamma_x$, $\gamma_y$, $\gamma_z$ are of the same form
as the relations connecting $m_x$, $m_y$, $m_z$ and $x$, $y$, $z$ in the calculation
in § 40 of the selection rule for the matrix elements of $z$ in a representation
with $k$ diagonal. From the result there obtained that all
matrix elements of $z$ vanish except those referring to two $k$ values
differing by $\pm\hbar$, we can infer that all matrix elements of $\gamma_z$, and
similarly of $\gamma_x$ and $\gamma_y$, in a representation with $j$ diagonal, vanish
except those referring to two $j$ values differing by $\pm\hbar$. The coefficients
of $\gamma_x$ and $\gamma_y$ in the second term on the right-hand side of (43)
commute with $j$, so the representative of the whole of this term will
contain only matrix elements referring to two $j$ values differing by
$\pm\hbar$, and thus referring to two different energy-levels of the unperturbed
system.

Hence the perturbing energy $V$ becomes, when we neglect that
part of it whose representative consists of matrix elements referring
to two different unperturbed energy-levels,

$$\frac{e\mathcal{H}}{2mc}\,j_z\left\{1+\frac{j(j+\hbar)-l(l+\hbar)+s(s+\hbar)}{2j(j+\hbar)}\right\}. \tag{45}$$

The eigenvalues of this give the first-order changes in the energy-levels.
We can make the representative of this expression diagonal
by choosing our representation such that $j_z$ is diagonal, and it then
gives us directly the first-order changes in the energy-levels caused by
the magnetic field. This expression is known as Landé's formula.
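
The content of (45) is easily tabulated. The sketch below works in units of $\hbar$, so that $j(j+\hbar)$ becomes $j(j+1)$, and expresses the energy shifts in units of $e\mathcal{H}\hbar/2mc$. The function names `lande_factor` and `zeeman_shifts` are hypothetical helpers chosen for the illustration.

```python
from fractions import Fraction

def lande_factor(j, l, s):
    """Factor in curly brackets in (45), with j, l, s in units of hbar."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    if j == 0:
        raise ValueError("the factor is undefined for j = 0 (j_z has only the eigenvalue 0)")
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))

def zeeman_shifts(j, l, s):
    """First-order shifts in units of e*H*hbar/(2mc), one for each eigenvalue of j_z."""
    g = lande_factor(j, l, s)
    j = Fraction(j)
    shifts = []
    m = -j                     # eigenvalues of j_z run from -j to +j in unit steps
    while m <= j:
        shifts.append(g * m)
        m += 1
    return shifts

half = Fraction(1, 2)
g_upper = lande_factor(3 * half, 1, half)   # l = 1, s = 1/2, j = 3/2
g_lower = lande_factor(half, 1, half)       # l = 1, s = 1/2, j = 1/2
```

For the doublet with $l = 1$, $s = \tfrac{1}{2}$ the two factors are $\tfrac{4}{3}$ and $\tfrac{2}{3}$, so that the two levels of the multiplet are split by different amounts, which is the characteristic feature of the anomalous effect.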

The result (45) holds only provided the perturbing energy $V$ is small
compared with the energy differences within a multiplet. For larger
values of $V$ a more complicated theory is required. For very strong
fields, however, for which $V$ is large compared with the energy differences
within a multiplet, the theory is again very simple. We may
now neglect altogether the energy of the spin magnetic moments for
the atom with no external field, so that for our unperturbed system
the vectors $\mathbf{l}$ and $\mathbf{s}$ themselves are constants of the motion, and not
merely their magnitudes $l$ and $s$. Our perturbing energy $V$, which is
still $e\mathcal{H}/2mc\cdot(j_z+s_z)$, is now a constant of the motion for the unperturbed
system, so that its eigenvalues give directly the changes in the
energy-levels. These eigenvalues are integral or half-odd integral
multiples of $e\mathcal{H}\hbar/2mc$ according to whether the number of electrons
in the atom is even or odd.




VIII 

COLLISION PROBLEMS 
48. General remarks 

In this chapter we shall investigate problems connected with a particle
which, coming from infinity, encounters or 'collides with' some
atomic system and, after being scattered through a certain angle, goes 
off to infinity again. The atomic system which does the scattering 
we shall call, for brevity, the scatterer . We thus have a dynamical 
system composed of an incident particle and a scatterer interacting 
with each other, which we must deal with according to the laws of 
quantum mechanics, and for which we must, in particular, calculate 
the probability of scattering through any given angle. The scatterer 
is usually assumed to be of infinite mass and to be at rest throughout 
the scattering process. The problem was first solved by Born by a 
method substantially equivalent to that of the next section. We must 
take into account the possibility that the scatterer, considered as a 
system by itself, may have a number of different stationary states 
and that if it is initially in one of these states when the particle arrives 
from infinity, it may be left in a different one when the particle goes 
off to infinity again. The colliding particle may thus induce transi¬ 
tions in the scatterer. 

The Hamiltonian for the whole system of scatterer plus particle 
will not involve the time explicitly, so that this whole system will 
have stationary states represented by periodic solutions of Schrödinger's
wave equation. The meaning of these stationary states
requires a little care to be properly understood. It is evident that 
for any state of motion of the system the particle will spend nearly all 
its time at infinity, so that the time average of the probability of the 
particle being in any finite volume will be zero. Now for a stationary 
state the probability of the particle being in a given finite volume, 
like any other result of observation, must be independent of the time, 
and hence this probability will equal its time average, which we have 
seen is zero. Thus only the relative probabilities of the particle being 
in different finite volumes will be physically significant, their absolute 
values being all zero. The total energy of the system has a continuous 
range of eigenvalues, since the initial energy of the particle can be 
anything. Thus a ket, $|s\rangle$ say, corresponding to a stationary state,
being an eigenket of the total energy, must be of infinite length. We
can see a physical reason for this, since if $|s\rangle$ were normalized and if
$Q$ denotes that observable (a certain function of the position of
the particle) that is equal to unity if the particle is in a given finite
volume and zero otherwise, then $\langle s|Q|s\rangle$ would be zero, meaning that
the average value of $Q$, i.e. the probability of the particle being in the
given volume, is zero. Such a ket $|s\rangle$ would not be a convenient one
to work with. However, with $|s\rangle$ of infinite length, $\langle s|Q|s\rangle$ can be
finite and would then give the relative probability of the particle
being in the given volume.

In picturing a state of a system corresponding to a ket $|x\rangle$ which
is not normalized, but for which $\langle x|x\rangle = n$ say, it may be convenient
to suppose that we have $n$ similar systems all occupying the same
space but with no interaction between them, so that each one follows
out its own motion independently of the others, as we had in the
theory of the Gibbs ensemble in § 33. We can then interpret $\langle x|\alpha|x\rangle$,
where $\alpha$ is any observable, directly as the total $\alpha$ for all the $n$ systems.
In applying these ideas to the above-mentioned $|s\rangle$ of infinite length,
corresponding to a stationary state of the system of scatterer plus
colliding particle, we should picture an infinite number of such systems
with the scatterers all located at the same point and the particles
distributed continuously throughout space. The number of particles
in a given finite volume would be pictured as $\langle s|Q|s\rangle$, $Q$ being the
observable defined above, which has the value unity when the particle
is in the given volume and zero otherwise. If the ket is represented
by a Schrödinger wave function involving the Cartesian coordinates
of the particle, then the square of the modulus of the wave function
could be interpreted directly as the density of particles in the picture.
One must remember, however, that each of these particles has its own 
individual scatterer. Different particles may belong to scatterers in 
different states. There will thus be one particle density for each state 
of the scatterer, namely the density of those particles belonging to 
scatterers in that state. This is taken account of by the wave function 
involving variables describing the state of the scatterer in addition 
to those describing the position of the particle. 

For determining scattering coefficients we have to investigate 
stationary states of the whole system of scatterer plus particle. For 
instance, if we want to determine the probability of scattering in 
various directions when the scatterer is initially in a given stationary
state and the incident particle has initially a given velocity in a given
direction, we must investigate that stationary state of the whole 
system whose picture, according to the above method, contains at 
great distances from the point of location of the scatterers only 
particles moving with the given initial velocity and direction and 
belonging each to a scatterer in the given initial stationary state, 
together with particles moving outward from the point of location 
of the scatterers and belonging possibly to scatterers in various 
stationary states. This picture corresponds closely to the actual state 
of affairs in an experimental determination of scattering coefficients, 
with the difference that the picture really describes only one actual 
system of scatterer plus particle. The distribution of outward moving 
particles at infinity in the picture gives us immediately all the infor¬ 
mation about scattering coefficients that could be obtained by experi¬ 
ment. For practical calculations about the stationary state described 
by this picture one may use a perturbation method somewhat like 
that of § 43, taking as unperturbed system, for example, that for 
which there is no interaction between the scatterer and particle. 

In dealing with collision problems, a further possibility to be taken 
into consideration is that the scatterer may perhaps be capable of 
absorbing and re-emitting the particle. This possibility arises when 
there exists one or more states of absorption of the whole system, a 
state of absorption being an approximately stationary state which 
is closed in the sense mentioned at the end of § 38 (i.e. for which 
the probability of the particle being at a greater distance than r from 
the scatterer tends to zero as $r \to \infty$). Since a state of absorption is
only approximately stationary, its property of being closed will be 
only a transient one, and after a sufficient lapse of time there will be 
a finite probability of the particle being on its way to infinity. 
Physically this means there is a finite probability of spontaneous 
emission of the particle. The fact that we had to use the word
'approximately' in stating the conditions required for the phenomena
of emission and absorption to be able to occur shows that these conditions
are not expressible in exact mathematical language. One can give
a meaning to these phenomena only with reference to a perturbation 
method. They occur when the unperturbed system (of scatterer plus 
particle) has stationary states that are closed. The introduction of the 
perturbation spoils the stationary property of these states and gives 
rise to spontaneous emission and its converse absorption.
For calculating absorption and emission probabilities it is necessary
to deal with non-stationary states of the system, in contradistinction 
to the case for scattering coefficients, so that the perturbation method 
of § 44 must be used. Thus for calculating an emission coefficient 
we must consider the non-stationary states of absorption described 
above. Again, since an absorption is always followed by a re-emission, 
it cannot be distinguished from a scattering in any experiment in¬ 
volving a steady state of affairs, corresponding to a stationary state 
of the system. The distinction can be made only by reference to a 
non-steady state of affairs, e.g. by use of a stream of incident particles 
that has a sharp beginning, so that the scattered particles will appear 
immediately after the incident particles meet the scatterers, while 
those that have been absorbed and re-emitted will begin to appear 
only some time later. This stream of particles would be the picture 
of a certain ket of infinite length, which could be used for calculating 
the absorption coefficient. 

49. The scattering coefficient 

We shall now consider the calculation of scattering coefficients,
taking first the case when there is no absorption and emission, which
means that our unperturbed system has no closed stationary states.
We may conveniently take this unperturbed system to be that for
which there is no interaction between the scatterer and particle. Its
Hamiltonian will thus be of the form

$$E = H_s+W, \tag{1}$$

where $H_s$ is that for the scatterer alone and $W$ that for the particle
alone, namely, with neglect of relativistic mechanics,

$$W = \frac{1}{2m}\,(p_x^2+p_y^2+p_z^2). \tag{2}$$

The perturbing energy $V$, assumed small, will now be a function of
the Cartesian coordinates of the particle $x$, $y$, $z$, and also, perhaps,
of its momenta $p_x$, $p_y$, $p_z$, together with dynamical variables describing
the scatterer.

Since we are now interested only in stationary states of the whole 
system, we use a perturbation method like that of § 43. Our unper¬ 
turbed system now necessarily has a continuous range of energy- 
levels, since it contains a free particle, and this gives rise to certain 
modifications in the perturbation method. The question of the change
in the energy-levels caused by the perturbation, which was the main
question of § 43, no longer has a meaning, and the convention in § 43
of using the same number of primes to denote nearly equal eigenvalues
of $E$ and $H$ now drops out. Again, the splitting of energy-
levels which we had in § 43 when the unperturbed system is degenerate 
cannot now arise, since if the unperturbed system is degenerate the 
perturbed one, which must also have a continuous range of energy- 
levels, will also be degenerate to exactly the same extent. 

We again use the general scheme of equations developed at the
beginning of § 43, equations (1) to (4) there, but we now take our
unperturbed stationary state forming the zero-order approximation
to belong to an energy-level $E'$ just equal to the energy-level $H'$ of
our perturbed stationary state. Thus the $a$'s introduced in the second
of equations (3) § 43 are now all zero and the second of equations
(4) there now reads

$$(E'-E)\,|1\rangle = V|0\rangle. \tag{3}$$

Similarly, the third of equations (4) § 43 now reads

$$(E'-E)\,|2\rangle = V|1\rangle. \tag{4}$$

We shall proceed to solve equation (3) and to obtain the scattering
coefficient to the first order. We shall need equation (4) in § 51.

Let $\alpha$ denote a complete set of commuting observables describing
the scatterer, which are constants of the motion when the scatterer is
alone and may thus be used for labelling the stationary states of the
scatterer. This requires that $H_s$ shall commute with the $\alpha$'s and be
a function of them. We can now take a representation of the whole
system in which the $\alpha$'s and x, y, z, the coordinates of the particle,
are diagonal. This will make $H_s$ diagonal. Let $|0\rangle$ be represented by
$\langle x\alpha'|0\rangle$ and $|1\rangle$ by $\langle x\alpha'|1\rangle$, the single variable x being written to
denote x, y, z and the prime being omitted from x for brevity. In the
same way the single differential $d^3x$ will be written to denote the
product dx dy dz. Equation (3), written in terms of representatives,
becomes, with the help of (1) and (2),

$$\{E'-H_s(\alpha')+\hbar^2/2m\cdot\nabla^2\}\langle x\alpha'|1\rangle = \sum_{\alpha''}\int\langle x\alpha'|V|x''\alpha''\rangle\,d^3x''\,\langle x''\alpha''|0\rangle. \tag{5}$$

Suppose that the incident particle has the momentum $p^0$ and that
the initial stationary state of the scatterer is $\alpha^0$. The stationary state
of our unperturbed system is now the one for which $p = p^0$ and
$\alpha = \alpha^0$, and hence its representative is

$$\langle x\alpha'|0\rangle = \delta_{\alpha'\alpha^0}\,e^{i(p^0,x)/\hbar}. \tag{6}$$

This makes equation (5) reduce to

$$\{E'-H_s(\alpha')+\hbar^2/2m\cdot\nabla^2\}\langle x\alpha'|1\rangle = \int\langle x\alpha'|V|x^0\alpha^0\rangle\,d^3x^0\,e^{i(p^0,x^0)/\hbar}$$

or

$$(k^2+\nabla^2)\langle x\alpha'|1\rangle = F, \tag{7}$$

where

$$k^2 = 2m\hbar^{-2}\{E'-H_s(\alpha')\} \tag{8}$$

and

$$F = 2m\hbar^{-2}\int\langle x\alpha'|V|x^0\alpha^0\rangle\,d^3x^0\,e^{i(p^0,x^0)/\hbar}, \tag{9}$$

a definite function of x, y, z, and $\alpha'$. We must also have

$$E' = H_s(\alpha^0)+p^{0\,2}/2m. \tag{10}$$


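Our problem now is to obtain a solution $\langle x\alpha'|1\rangle$ of (7) which, for
values of x, y, z denoting points far from the scatterer, represents
only outward moving particles. The square of its modulus, $|\langle x\alpha'|1\rangle|^2$,
will then give the density of scattered particles belonging to scatterers
in the state $\alpha'$ when the density of the incident particles is $|\langle x\alpha^0|0\rangle|^2$,
which is unity. If we transform to polar coordinates r, $\theta$, $\phi$, equation
(7) becomes

$$\left\{k^2+\frac{\partial^2}{\partial r^2}+\frac{2}{r}\frac{\partial}{\partial r}+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta}+\frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right\}\langle r\theta\phi\alpha'|1\rangle = F. \tag{11}$$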


Now F must tend to zero as $r\to\infty$, on account of the physical requirement
that the interaction energy between the scatterer and
particle must tend to zero as the distance between them tends to
infinity. If we neglect F in (11) altogether, an approximate solution
for large r is

$$\langle r\theta\phi\alpha'|1\rangle = u(\theta\phi\alpha')\,r^{-1}e^{ikr}, \tag{12}$$

where u is an arbitrary function of $\theta$, $\phi$, and $\alpha'$, since this expression
substituted in the left-hand side of (11) gives a result of order $r^{-3}$.
When we do not neglect F, the solution of (11) will still be of the
form (12) for large r, provided F tends to zero sufficiently rapidly as
$r\to\infty$, but the function u will now be definite and determined by the
solution for smaller values of r.
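
The statement that (12) satisfies the left-hand side of (11) to order $r^{-3}$ rests on the radial part of the operator annihilating $r^{-1}e^{ikr}$ exactly, so that only the angular derivatives of u survive. The following is a minimal numerical sketch of that radial check, using central differences; the function name, the step `dr`, and the sample values of k and r are illustrative assumptions and not part of the text.

```python
import cmath

def radial_operator_on_outgoing_wave(k, r, dr=1e-4):
    """Apply (k^2 + d^2/dr^2 + (2/r) d/dr) to e^{ikr}/r by central differences."""
    def psi(x):
        # Radial factor of expression (12), with u set to 1.
        return cmath.exp(1j * k * x) / x

    first = (psi(r + dr) - psi(r - dr)) / (2 * dr)
    second = (psi(r + dr) - 2 * psi(r) + psi(r - dr)) / dr**2
    return k**2 * psi(r) + second + (2 / r) * first

# Hypothetical sample point: k = 2, r = 50 in arbitrary units.
residual = radial_operator_on_outgoing_wave(k=2.0, r=50.0)
print(abs(residual))  # small compared with |k^2 psi|, which is about 0.08 here
```

The residual is limited only by the finite-difference step, which supports the claim that the whole of the $r^{-3}$ remainder comes from the angular terms acting on u.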

For values $\alpha'$ of the $\alpha$'s such that $k^2$, defined by (8), is positive, the
k in (12) must be chosen to be the positive square root of $k^2$, in order
that (12) may represent only outward moving particles, i.e. particles
for which the radial component of momentum, which from § 38
equals $p_r-i\hbar r^{-1}$ or $-i\hbar(\partial/\partial r+r^{-1})$, has a positive value. We now
have that the density of scattered particles belonging to scatterers in
state $\alpha'$, equal to the square of the modulus of (12), falls off with
increasing r according to the inverse square law, as is physically
necessary, and their angular distribution is given by $|u(\theta\phi\alpha')|^2$.
Further, the magnitude, P' say, of the momentum of these scattered
particles must equal $k\hbar$, the momentum being radial for large r,
so that their energy is equal to

$$\frac{P'^2}{2m} = \frac{k^2\hbar^2}{2m} = E'-H_s(\alpha') = H_s(\alpha^0)-H_s(\alpha')+\frac{p^{0\,2}}{2m}$$

with the help of (8) and (10). This is just the energy of an incident
particle, namely $p^{0\,2}/2m$, reduced by the increase in energy of the
scatterer, namely $H_s(\alpha')-H_s(\alpha^0)$, in agreement with the law of conservation
of energy. For values $\alpha'$ of the $\alpha$'s such that $k^2$ is negative
there are no scattered particles, the total initial energy being insufficient
for the scatterer to be left in the state $\alpha'$.
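
The energy balance just stated can be turned into a small calculation of the final momentum P'. The sketch below assumes non-relativistic mechanics as in this section; the function name and the numerical values are hypothetical and serve only to illustrate the positive and negative $k^2$ cases.

```python
import math

def scattered_momentum(p0, hs_initial, hs_final, m):
    """Return P' from P'^2/2m = H_s(alpha^0) - H_s(alpha') + p0^2/2m.

    Returns None when the right-hand side is negative (k^2 negative),
    i.e. when the final scatterer state is energetically closed.
    """
    kinetic = hs_initial - hs_final + p0**2 / (2 * m)
    if kinetic < 0:
        return None
    return math.sqrt(2 * m * kinetic)

# Hypothetical values in arbitrary units.
print(scattered_momentum(2.0, 0.0, 0.0, 1.0))  # elastic case: P' equals P0
print(scattered_momentum(2.0, 0.0, 1.5, 1.0))  # scatterer excited: P' reduced
print(scattered_momentum(2.0, 0.0, 3.0, 1.0))  # too little energy: None
```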

We must now evaluate $u(\theta\phi\alpha')$ for a set of values $\alpha'$ for the $\alpha$'s such
that $k^2$ is positive, and obtain the angular distribution of the scattered
particles belonging to scatterers in state $\alpha'$. It is sufficient to evaluate
u for the direction $\theta = 0$ of the pole of the polar coordinates, since
this direction is arbitrary. We make use of Green's theorem, which
states that for any two functions of position A and B the volume
integral $\int(A\nabla^2B-B\nabla^2A)\,d^3x$ taken over any volume equals the
surface integral $\int(A\,\partial B/\partial n-B\,\partial A/\partial n)\,dS$ taken over the boundary
of the volume, $\partial/\partial n$ denoting differentiation along the normal to
the surface. We take

$$A = e^{-ikr\cos\theta}, \qquad B = \langle r\theta\phi\alpha'|1\rangle$$

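and apply the theorem to a large sphere with the origin as centre.
The volume integrand is thus

$$e^{-ikr\cos\theta}\,\nabla^2\langle r\theta\phi\alpha'|1\rangle-\langle r\theta\phi\alpha'|1\rangle\,\nabla^2e^{-ikr\cos\theta} = e^{-ikr\cos\theta}(\nabla^2+k^2)\langle r\theta\phi\alpha'|1\rangle = e^{-ikr\cos\theta}F$$

from (7) or (11), while the surface integrand is, with the help of (12),

$$e^{-ikr\cos\theta}\frac{\partial}{\partial r}\langle r\theta\phi\alpha'|1\rangle-\langle r\theta\phi\alpha'|1\rangle\frac{\partial}{\partial r}e^{-ikr\cos\theta}$$

$$= e^{-ikr\cos\theta}\,u\left(-\frac{1}{r^2}+\frac{ik}{r}\right)e^{ikr}+iku\,\frac{e^{ikr}}{r}\cos\theta\,e^{-ikr\cos\theta}$$

$$= iku\,r^{-1}(1+\cos\theta)\,e^{ikr(1-\cos\theta)}$$

with neglect of $r^{-2}$. Hence we get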


$$\int e^{-ikr\cos\theta}F\,d^3x = \int_0^{2\pi}d\phi\int_0^{\pi}r^2\sin\theta\,d\theta\cdot iku\,r^{-1}(1+\cos\theta)\,e^{ikr(1-\cos\theta)},$$

the volume integral on the left being taken over the whole of space.
The right-hand side becomes, on being integrated by parts with
respect to $\theta$,

$$\int_0^{2\pi}d\phi\left\{\Big[u(1+\cos\theta)\,e^{ikr(1-\cos\theta)}\Big]_{\theta=0}^{\theta=\pi}-\int_0^{\pi}e^{ikr(1-\cos\theta)}\frac{\partial}{\partial\theta}[u(1+\cos\theta)]\,d\theta\right\}.$$

The second term in the { } brackets is of the order of magnitude of
$r^{-1}$, as would be revealed by further partial integrations, and may
therefore be neglected. We are thus left with

$$\int e^{-ikr\cos\theta}F\,d^3x = -\int_0^{2\pi}d\phi\;2u(0\phi\alpha') = -4\pi u(0\phi\alpha'),$$

giving the value of $u(\theta\phi\alpha')$ for the direction $\theta = 0$.

This result may be written

$$u(0\phi\alpha') = -(4\pi)^{-1}\int e^{-iP'r\cos\theta/\hbar}F\,d^3x, \tag{13}$$

since $k = P'/\hbar$. If the vector p' denotes the momentum of the scattered
electrons coming off in a certain direction (and is thus of magnitude
P'), the value of u for this direction will be

$$u(\theta'\phi'\alpha') = -(4\pi)^{-1}\int e^{-i(p',x)/\hbar}F\,d^3x,$$

as follows from (13) if one takes this direction to be the pole of the
polar coordinates. This becomes, with the help of (9),

$$u(\theta'\phi'\alpha') = -(2\pi)^{-1}m\hbar^{-2}\iint e^{-i(p',x)/\hbar}\,d^3x\,\langle x\alpha'|V|x^0\alpha^0\rangle\,d^3x^0\,e^{i(p^0,x^0)/\hbar}$$

$$= -2\pi mh\,\langle p'\alpha'|V|p^0\alpha^0\rangle, \tag{14}$$

when one makes a transformation from the coordinates x to the 
momenta p of the particle, using the transformation function (54) 
of § 23. The single letter p is here used as a label for the three 
components of momentum. 

The density of scattered particles belonging to scatterers in state
$\alpha'$ is now given by $|u(\theta'\phi'\alpha')|^2/r^2$. Since their velocity is $P'/m$, the
rate at which these particles appear per unit solid angle about the
direction of the vector p' will be $P'/m\cdot|u(\theta'\phi'\alpha')|^2$. The density of
the incident particles is, as we have seen, unity, so that the number
of incident particles crossing unit area per unit time is equal to their
velocity $P^0/m$, where $P^0$ is the magnitude of $p^0$. Hence the effective
area that must be hit by an incident particle in order to be scattered
in a unit solid angle about the direction p' and then belong to a
scatterer in state $\alpha'$ will be

$$P'/P^0\cdot|u(\theta'\phi'\alpha')|^2 = 4\pi^2m^2h^2P'/P^0\cdot|\langle p'\alpha'|V|p^0\alpha^0\rangle|^2. \tag{15}$$

This is the scattering coefficient for transitions $\alpha^0\to\alpha'$ of the scatterer.
It depends on that matrix element $\langle p'\alpha'|V|p^0\alpha^0\rangle$ of the perturbing
energy V whose column $p^0\alpha^0$ and whose row $p'\alpha'$ refer respectively to
the initial and final states of the unperturbed system, between which
the scattering transition process takes place. The result (15) is thus
in some ways analogous to the result (24) of § 44, although the
numerical coefficients are different in the two cases, corresponding
to the different natures of the two transition processes.

50. Solution with the momentum representation 

The result (15) for the scattering coefficient makes a reference only 
to that representation in which the momentum p is diagonal. One 
would thus expect to be able to get a more direct proof of the result 
by working all the time in the p-representation, instead of working 
in the x-representation and transforming at the end to the p-representation,
as was done in § 49. This would not at first sight appear
to be a great improvement, as the lack of directness of the x-representation
method is offset by more direct applicability, it being
possible to picture the square of the modulus of the x-representative 
of a state as the density of a stream of particles in process of being 
scattered. The x-representation method has, however, other more 
serious disadvantages. One of the main applications of the theory 
of collisions is to the case of photons as incident particles. Now a 
photon is not a simple particle but has a polarization. It is evident 
from classical electromagnetic theory that a photon with a definite 
momentum, i.e. one moving in a definite direction with a definite 
frequency, may have a definite state of polarization (linear, circular, 
etc.), while a photon with a definite position, which is to be pictured 
as an electromagnetic disturbance confined to a very small volume, 
cannot have any definite polarization. These facts mean that the 
polarization observable of a photon commutes with its momentum 
but not with its position. This results in the p-representation method
being immediately applicable to the case of photons, it being only
necessary to introduce the polarizing variable into the representatives
and treat it along with the $\alpha$'s describing the scatterer, while the
x-representation method is not applicable. Further, in dealing with
photons, it is necessary to take relativistic mechanics into account. 
This can easily be done in the p-representation method, but not so 
easily in the x-representation method. 

Equation (3) still holds with relativistic mechanics, but W is now
given by

$$W^2/c^2 = m^2c^2+P^2 = m^2c^2+p_x^2+p_y^2+p_z^2 \tag{16}$$

instead of by (2). Written in terms of p-representatives, equation (3)
gives

$$\{E'-H_s(\alpha')-W\}\langle p\alpha'|1\rangle = \langle p\alpha'|V|0\rangle,$$

p being written instead of p' for brevity and W being understood as
a definite function of $p_x$, $p_y$, $p_z$ given by (16). This may be written

$$(W'-W)\langle p\alpha'|1\rangle = \langle p\alpha'|V|0\rangle, \tag{17}$$

where

$$W' = E'-H_s(\alpha') \tag{18}$$

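and is the energy required by the law of conservation of energy for
a scattered particle belonging to a scatterer in state $\alpha'$. The ket $|0\rangle$
is represented by (6) in the x-representation and the basic ket $|p^0\alpha^0\rangle$
is represented by

$$\langle x\alpha'|p^0\alpha^0\rangle = \delta_{\alpha'\alpha^0}\langle x|p^0\rangle = \delta_{\alpha'\alpha^0}\,h^{-3/2}e^{i(p^0,x)/\hbar},$$

from the transformation function (54) of § 23. Hence

$$|0\rangle = h^{3/2}|p^0\alpha^0\rangle, \tag{19}$$

and equation (17) may be written

$$(W'-W)\langle p\alpha'|1\rangle = h^{3/2}\langle p\alpha'|V|p^0\alpha^0\rangle. \tag{20}$$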

We now make a transformation from the Cartesian coordinates
$p_x$, $p_y$, $p_z$ of p to its polar coordinates P, $\omega$, $\chi$, given by

$$p_x = P\cos\omega, \qquad p_y = P\sin\omega\cos\chi, \qquad p_z = P\sin\omega\sin\chi.$$

If in the new representation we take the weight function $P^2\sin\omega$,
then the weight attached to any volume of p-space will be the same
as in the previous p-representation, so that the transformation will
mean simply a relabelling of the rows and columns of the matrices
without any alteration of the matrix elements. Thus (20) will become
in the new representation

$$(W'-W)\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle, \tag{21}$$

W being now a function of the single variable P.

The coefficient of $\langle P\omega\chi\alpha'|1\rangle$, namely $W'-W$, is now simply a
multiplying factor and not a differential operator as it was with the
x-representation method. We can therefore divide out by this factor
and obtain an explicit expression for $\langle P\omega\chi\alpha'|1\rangle$. When, however, $\alpha'$
is such that W', defined by (18), is greater than $mc^2$, this factor will
have the value zero for a certain point in the domain of the variable
P, namely the point P = P', given in terms of W' by (16). The
function $\langle P\omega\chi\alpha'|1\rangle$ will then have a singularity at this point. This
singularity shows that $\langle P\omega\chi\alpha'|1\rangle$ represents an infinite number of
particles moving about at great distances from the scatterers with
energies indefinitely close to W' and it is therefore this singularity
that we have to study to get the angular distribution of the particles
at infinity.

The result of dividing out (21) by the factor $W'-W$ is, according
to (13) of § 15,

$$\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle/(W'-W)+\lambda(\omega\chi\alpha')\,\delta(W'-W), \tag{22}$$

where $\lambda$ is an arbitrary function of $\omega$, $\chi$, and $\alpha'$. To give a meaning
to the first term on the right-hand side of (22), we make the convention
that its integral with respect to P over a range that includes the
value P' is the limit when $\epsilon\to0$ of the integral when the small
domain $P'-\epsilon$ to $P'+\epsilon$ is excluded from the range of integration.
This is sufficient to make the meaning of (22) precise, since we are
interested effectively only in the integrals of the representatives of
states when the representation has continuous ranges of rows and
columns. We see that equation (21) is inadequate to determine the
representative $\langle P\omega\chi\alpha'|1\rangle$ completely, on account of the arbitrary
function $\lambda$ occurring in (22). We must choose this $\lambda$ such that
$\langle P\omega\chi\alpha'|1\rangle$ represents only outward moving particles, since we want
the only inward moving particles to be those corresponding to $|0\rangle$.

Let us take first the general case when the representative $\langle P\omega\chi|\rangle$
of a state of the particle satisfies an equation of the type

$$(W'-W)\langle P\omega\chi|\rangle = f(P\omega\chi), \tag{23}$$

where $f(P\omega\chi)$ is any function of P, $\omega$, and $\chi$, and W' is a number
greater than $mc^2$, so that $\langle P\omega\chi|\rangle$ is of the form

$$\langle P\omega\chi|\rangle = f(P\omega\chi)/(W'-W)+\lambda(\omega\chi)\,\delta(W'-W), \tag{24}$$

and let us determine now what $\lambda$ must be in order that $\langle P\omega\chi|\rangle$ may
represent only outward moving particles. We can do this by transforming
$\langle P\omega\chi|\rangle$ to the x-representation, or rather the $(r\theta\phi)$-representation,
and comparing it with (12) for large values of r. The
transformation function is

$$\langle r\theta\phi|P\omega\chi\rangle = h^{-3/2}e^{i(p,x)/\hbar} = h^{-3/2}e^{iPr[\cos\omega\cos\theta+\sin\omega\sin\theta\cos(\chi-\phi)]/\hbar}.$$

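For the direction $\theta = 0$ we find

$$\langle r0\phi|\rangle = h^{-3/2}\int_0^\infty P^2\,dP\int_0^{2\pi}d\chi\int_0^\pi\sin\omega\,d\omega\;e^{iPr\cos\omega/\hbar}\langle P\omega\chi|\rangle$$

$$= h^{-3/2}\int_0^\infty P^2\,dP\int_0^{2\pi}d\chi\left\{-\left[\frac{e^{iPr\cos\omega/\hbar}}{iPr/\hbar}\langle P\omega\chi|\rangle\right]_{\omega=0}^{\omega=\pi}+\int_0^\pi\frac{e^{iPr\cos\omega/\hbar}}{iPr/\hbar}\frac{\partial}{\partial\omega}\langle P\omega\chi|\rangle\,d\omega\right\}.$$

The second term in the { } brackets is of order $r^{-2}$, as may be verified
by further partial integrations with respect to $\omega$, and can therefore
be neglected. We are left with

$$\langle r0\phi|\rangle = ih^{-1/2}(2\pi r)^{-1}\int_0^\infty P\,dP\int_0^{2\pi}d\chi\,\{e^{-iPr/\hbar}\langle P\pi\chi|\rangle-e^{iPr/\hbar}\langle P0\chi|\rangle\}$$

$$= ih^{-1/2}r^{-1}\int_0^\infty P\,dP\,\{e^{-iPr/\hbar}\langle P\pi\chi|\rangle-e^{iPr/\hbar}\langle P0\chi|\rangle\}. \tag{25}$$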


When we substitute for $\langle P\omega\chi|\rangle$ its value given by (24), the first
term in the integrand in (25) gives

$$ih^{-1/2}r^{-1}\int_0^\infty P\,dP\,e^{-iPr/\hbar}\{f(P\pi\chi)/(W'-W)+\lambda(\pi\chi)\,\delta(W'-W)\}. \tag{26}$$

The term involving $\delta(W'-W)$ here may be integrated immediately
and gives, when one uses the relation $P\,dP = W\,dW/c^2$, which
follows from (16),

$$ih^{-1/2}c^{-2}r^{-1}\int_{mc^2}^\infty W\,dW\,e^{-iPr/\hbar}\lambda(\pi\chi)\,\delta(W'-W) = ih^{-1/2}c^{-2}r^{-1}W'\lambda(\pi\chi)\,e^{-iP'r/\hbar}. \tag{27}$$
To integrate the other term in (26) we use the formula

$$\int_0^\infty g(P)\frac{e^{-iPr/\hbar}}{P'-P}\,dP = g(P')\int_0^\infty\frac{e^{-iPr/\hbar}}{P'-P}\,dP, \tag{28}$$

with neglect of terms involving $r^{-1}$, for any continuous function g(P),
which formula holds since $\int_0^\infty K(P)\,e^{-iPr/\hbar}\,dP$ is of order $r^{-1}$ for any
continuous function K(P) and since the difference

$$g(P)/(P'-P)-g(P')/(P'-P)$$

is continuous. The right-hand side of (28), when evaluated with
neglect of terms involving $r^{-1}$, and also with neglect of the small
domain $P'-\epsilon$ to $P'+\epsilon$ in the domain of integration, gives

$$g(P')\int_0^\infty\frac{e^{-iPr/\hbar}}{P'-P}\,dP = g(P')\,e^{-iP'r/\hbar}\int_{-\infty}^\infty\frac{e^{i(P'-P)r/\hbar}}{P'-P}\,dP$$

$$= ig(P')\,e^{-iP'r/\hbar}\int_{-\infty}^\infty\frac{\sin\{(P'-P)r/\hbar\}}{P'-P}\,dP = i\pi g(P')\,e^{-iP'r/\hbar}. \tag{29}$$

— 00 

In our present example g(P) is

$$g(P) = ih^{-1/2}r^{-1}Pf(P\pi\chi)(P'-P)/(W'-W),$$

which has the limiting value when P = P',

$$g(P') = ih^{-1/2}r^{-1}P'f(P'\pi\chi)\,W'/P'c^2 = ih^{-1/2}c^{-2}r^{-1}W'f(P'\pi\chi).$$

Substituting this in (29) and adding on the expression (27), we obtain
the following value for the integral (26)

$$h^{-1/2}c^{-2}r^{-1}W'\{-\pi f(P'\pi\chi)+i\lambda(\pi\chi)\}\,e^{-iP'r/\hbar}. \tag{30}$$

Similarly the second term in the integrand in (25) gives

$$h^{-1/2}c^{-2}r^{-1}W'\{-\pi f(P'0\chi)-i\lambda(0\chi)\}\,e^{iP'r/\hbar}. \tag{31}$$

The sum of these two expressions is the value of $\langle r0\phi|\rangle$ when r is
large.

We require that $\langle r0\phi|\rangle$ shall represent only outward moving
particles, and hence it must be of the form of a multiple of $e^{iP'r/\hbar}$.
Thus (30) must vanish, so that

$$\lambda(\pi\chi) = -i\pi f(P'\pi\chi). \tag{32}$$

We see in this way that the condition that $\langle r\theta\phi|\rangle$ shall represent
only outward moving particles in the direction $\theta = 0$ fixes the value
of $\lambda$ for the opposite direction $\theta = \pi$. Since the direction $\theta = 0$ or
$\omega = 0$ of the pole of our polar coordinates is not in any way singular,
we can generalize (32) to

$$\lambda(\omega\chi) = -i\pi f(P'\omega\chi), \tag{33}$$

which gives the value of $\lambda$ for an arbitrary direction. This value
substituted in (24) gives a result that may be written

$$\langle P\omega\chi|\rangle = f(P\omega\chi)\{1/(W'-W)-i\pi\,\delta(W'-W)\}, \tag{34}$$

since one can substitute P' for P in the coefficient of a term involving
$\delta(W'-W)$ as a factor without changing the value of the term. The
condition that $\langle P\omega\chi|\rangle$ shall represent only outward moving particles is
thus that it shall contain the factor

$$\{1/(W'-W)-i\pi\,\delta(W'-W)\}. \tag{35}$$

It is interesting to note that this factor is of the form of the right- 
hand side of equation (15) of § 15. 
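
A factor of the form (35) can be pictured as the limit of $1/(x+i\epsilon)$ for small positive $\epsilon$, with $x = W'-W$: the real part tends to the excluded-domain (principal value) reading of $1/x$ and the imaginary part to $-\pi\delta(x)$. This identification is a standard identity and is not stated in the text; the sketch below checks the imaginary part numerically for an arbitrary smooth test function. The function names, the Gaussian test function, and the values of `eps` and `step` are all assumptions made for illustration.

```python
import math

def smeared_integral(g, eps, half_range=10.0, step=5e-4):
    """Midpoint-rule estimate of the integral of g(x)/(x + i*eps) over [-half_range, half_range]."""
    n = int(round(2 * half_range / step))
    total = 0j
    for j in range(n):
        x = -half_range + (j + 0.5) * step
        total += g(x) / complex(x, eps)
    return total * step

def g(x):
    # Hypothetical smooth test function, deliberately not symmetric about x = 0.
    return math.exp(-(x - 0.3) ** 2)

value = smeared_integral(g, eps=0.01)
# The imaginary part should approach -pi * g(0) as eps tends to zero.
print(value.imag, -math.pi * g(0.0))
```

With `eps` much smaller than the scale on which g varies, the two printed numbers agree to within a correction of order `eps`, which is the behaviour expected of the $-i\pi\delta$ term.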

With $\lambda$ given by (33), expression (30) vanishes and the value of
$\langle r0\phi|\rangle$ for large r is given by expression (31) alone, thus

$$\langle r0\phi|\rangle = -2\pi h^{-1/2}c^{-2}r^{-1}W'f(P'0\chi)\,e^{iP'r/\hbar}.$$

This may be generalized to

$$\langle r\theta\phi|\rangle = -2\pi h^{-1/2}c^{-2}r^{-1}W'f(P'\omega\chi)\,e^{iP'r/\hbar},$$

giving the value of $\langle r\theta\phi|\rangle$ for any direction $\theta$, $\phi$ in terms of $f(P'\omega\chi)$
for the same direction labelled by $\omega$, $\chi$. This is of the form (12) with

$$u(\theta\phi) = -2\pi h^{-1/2}c^{-2}W'f(P'\omega\chi)$$

and thus represents a distribution of outward moving particles of
momentum P' whose number is

$$\frac{c^2P'}{W'}|u|^2 = \frac{4\pi^2W'P'}{hc^2}|f(P'\omega\chi)|^2 \tag{36}$$

per unit solid angle per unit time. This distribution is the one
represented by the $\langle P\omega\chi|\rangle$ of (34).

From this general result we can infer that, whenever we have a
representative $\langle P\omega\chi|\rangle$ representing only outward moving particles
and satisfying an equation of the type (23), the number per unit solid
angle per unit time of these particles is given by (36). If this $\langle P\omega\chi|\rangle$
occurs in a problem in which the number of incident particles is one
per unit volume, it will correspond to a scattering coefficient of
amount

$$\frac{4\pi^2W^0W'P'}{hc^4P^0}|f(P'\omega\chi)|^2. \tag{37}$$

It is only the value of the function $f(P\omega\chi)$ for the point P = P' that
is of importance.

If we now apply this general theory to our equations (21) and
(22), we have

$$f(P\omega\chi) = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle.$$

Hence from (37) the scattering coefficient is

$$4\pi^2h^2W^0W'P'/c^4P^0\cdot|\langle P'\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle|^2. \tag{38}$$

If one neglects relativity and puts $W^0W'/c^4 = m^2$, this result reduces
to the result (15) obtained in the preceding section by means of
Green's theorem.
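
The reduction of (38) to (15) at low momenta can be checked by direct arithmetic. The sketch below evaluates both coefficients for the same matrix element, taking W from (16); the function names, the units (h = c = m = 1), and the sample momenta and matrix element are hypothetical.

```python
import math

def coefficient_38(h, c, m, p_in, p_out, matrix_element):
    """Relativistic scattering coefficient (38), with W taken from (16)."""
    w_in = math.sqrt(m**2 * c**4 + p_in**2 * c**2)
    w_out = math.sqrt(m**2 * c**4 + p_out**2 * c**2)
    prefactor = 4 * math.pi**2 * h**2 * w_in * w_out * p_out / (c**4 * p_in)
    return prefactor * abs(matrix_element) ** 2

def coefficient_15(h, m, p_in, p_out, matrix_element):
    """Non-relativistic scattering coefficient (15)."""
    prefactor = 4 * math.pi**2 * m**2 * h**2 * p_out / p_in
    return prefactor * abs(matrix_element) ** 2

# Hypothetical slow particle: momenta much smaller than m*c.
element = 0.2 + 0.1j
ratio = (coefficient_38(1.0, 1.0, 1.0, 1e-3, 5e-4, element)
         / coefficient_15(1.0, 1.0, 1e-3, 5e-4, element))
print(ratio)  # close to 1; the excess is of order (P/mc)^2
```

The ratio equals $W^0W'/m^2c^4$, so it departs from unity only by terms of the second order in P/mc.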

51. Dispersive scattering 

We shall now determine the scattering when the incident particle 
is capable of being absorbed, that is, when our unperturbed system 
of scatterer plus particle has closed stationary states with the particle 
absorbed. The existence of these closed states for the unperturbed 
system will be found to have a considerable effect on the scattering 
for the perturbed system, and indeed an effect that depends very 
much on the energy of the incident particle, giving rise to the phenomenon
of dispersion in optics when the incident particle is taken to
be a photon.

We use a representation for which the basic kets correspond to
the stationary states of the unperturbed system, as was the case with
the p-representation of the preceding section. We take these stationary
states to be the states $(p'\alpha')$ for which the particle has a definite
momentum p' and the scatterer is in a definite state $\alpha'$, together with
the closed states, k say, which form a separate discrete set, and
assume that these states are all independent and orthogonal. This
assumption is not accurate when the particle is an electron or atomic
nucleus, since in this case for an absorbed state k the particle will
still certainly be somewhere, so that one would expect to be able to
expand $|k\rangle$ in terms of the eigenkets $|x'\alpha'\rangle$ of x, y, z, and the $\alpha$'s,
and hence also in terms of the $|p'\alpha'\rangle$'s. On the other hand, when the
particle is a photon it will no longer exist for the absorbed states,
which are then certainly independent of and orthogonal to the states
$(p'\alpha')$ for which the particle does exist. Thus the assumption is valid
in this case, which is an important practical one.

Since we are concerned with scattering, we must still deal with
stationary states of the whole system. We shall now, however, have
to work to the second order of accuracy, so that we cannot use merely
the first-order equation (3), but must use also (4). Equation (3)
becomes, when written in terms of representatives in our present
representation,

$$\begin{aligned}
(W'-W)\langle p\alpha'|1\rangle &= \langle p\alpha'|V|0\rangle,\\
(E'-E_k)\langle k|1\rangle &= \langle k|V|0\rangle,
\end{aligned} \tag{39}$$

where W' is the function of E' and the $\alpha'$'s given by (18) and $E_k$ is the
energy of the stationary state k of the unperturbed system. Similarly,
equation (4) becomes

$$\begin{aligned}
(W'-W)\langle p\alpha'|2\rangle &= \langle p\alpha'|V|1\rangle,\\
(E'-E_k)\langle k|2\rangle &= \langle k|V|1\rangle.
\end{aligned} \tag{40}$$


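Expanding the right-hand sides by matrix multiplication, we get

$$\begin{aligned}
(W'-W)\langle p\alpha'|2\rangle &= \sum_{\alpha''}\int\langle p\alpha'|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|1\rangle+\sum_{k''}\langle p\alpha'|V|k''\rangle\langle k''|1\rangle,\\
(E'-E_k)\langle k|2\rangle &= \sum_{\alpha''}\int\langle k|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|1\rangle+\sum_{k''}\langle k|V|k''\rangle\langle k''|1\rangle.
\end{aligned} \tag{41}$$

The ket $|0\rangle$ is still given by (19), so (39) may be written

$$(W'-W)\langle p\alpha'|1\rangle = h^{3/2}\langle p\alpha'|V|p^0\alpha^0\rangle, \tag{42}$$

$$(E'-E_k)\langle k|1\rangle = h^{3/2}\langle k|V|p^0\alpha^0\rangle. \tag{43}$$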

We may assume that the matrix elements $\langle k'|V|k''\rangle$ of V vanish,
since these matrix elements are not essential to the phenomena under
investigation, and if they did not vanish it would mean simply that
the absorbed states k had not been suitably chosen. We shall further
assume that the matrix elements $\langle p'\alpha'|V|p''\alpha''\rangle$ are of the second order
of smallness when the matrix elements $\langle k'|V|p''\alpha''\rangle$, $\langle p'\alpha'|V|k''\rangle$ are
taken to be of the first order of smallness. This assumption will be
justified for the case of photons in § 64. We now have from (43) and
(42) that $\langle k|1\rangle$ is of the first order of smallness, provided E' does not
lie near one of the discrete set of energy-levels $E_k$, and $\langle p\alpha'|1\rangle$ is of
the second order. The value of $\langle p\alpha'|2\rangle$ to the second order will thus
be given, from the first of equations (41), by

$$(W'-W)\langle p\alpha'|2\rangle = h^{3/2}\sum_{k''}\langle p\alpha'|V|k''\rangle\langle k''|V|p^0\alpha^0\rangle/(E'-E_{k''}).$$

The total correction in the wave function to the second order, namely
$\langle p\alpha'|1\rangle$ plus $\langle p\alpha'|2\rangle$, therefore satisfies

$$(W'-W)\{\langle p\alpha'|1\rangle+\langle p\alpha'|2\rangle\} = h^{3/2}\Big\{\langle p\alpha'|V|p^0\alpha^0\rangle+\sum_k\langle p\alpha'|V|k\rangle\langle k|V|p^0\alpha^0\rangle/(E'-E_k)\Big\}.$$

This equation is of the type (23), provided $\alpha'$ is such that $W' > mc^2$,
which means that $\alpha'$ as a final state for the scatterer is not inconsistent
with the law of conservation of energy. We can therefore infer
from the general result (37) that the scattering coefficient is

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\left|\langle p'\alpha'|V|p^0\alpha^0\rangle+\sum_k\frac{\langle p'\alpha'|V|k\rangle\langle k|V|p^0\alpha^0\rangle}{E'-E_k}\right|^2. \tag{44}$$

The scattering may now be considered as composed of two parts,
a part that arises from the matrix element $\langle p'\alpha'|V|p^0\alpha^0\rangle$ of the perturbing
energy and a part that arises from the matrix elements
$\langle p'\alpha'|V|k\rangle$ and $\langle k|V|p^0\alpha^0\rangle$. The first part, which is the same as our
previously obtained result (38), may be called the direct scattering.
The second part may be considered as arising from an absorption of
the incident particle into some state k, followed immediately by a
re-emission in a different direction, and is like the transitions through
an intermediate state considered in § 44. The fact that we have to
add the two terms before taking the square of the modulus denotes
interference between the two kinds of scattering. There is no experimental
way of separating the two kinds, the distinction between
them being only mathematical.
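
The interference between the two kinds of scattering can be seen in a small numerical sketch of the quantity inside the modulus signs of the scattering coefficient. Only that amplitude is computed; the kinematic prefactor is omitted. The function name and every numerical value (matrix elements, $E_k$, total energy) are hypothetical.

```python
def dispersive_amplitude(direct, couplings, total_energy):
    """Direct matrix element plus the sum over absorbed states k.

    couplings is a list of (emission_element, absorption_element, e_k), standing for
    <p'a'|V|k>, <k|V|p0 a0> and E_k; total_energy is E'.
    """
    amplitude = complex(direct)
    for emission_element, absorption_element, e_k in couplings:
        amplitude += emission_element * absorption_element / (total_energy - e_k)
    return amplitude

# Hypothetical single absorbed state lying above the total energy E'.
direct = 0.10
couplings = [(0.3, 0.2, 5.0)]
total = dispersive_amplitude(direct, couplings, total_energy=4.0)
resonant = total - direct

combined = abs(total) ** 2                       # what the scattering coefficient contains
separate = abs(direct) ** 2 + abs(resonant) ** 2  # what one would get without interference
print(combined, separate)
```

In this example the two contributions have opposite signs, so the combined square is much smaller than the sum of the separate squares; with $E'$ above $E_k$ the sign of the second term reverses and the interference becomes constructive.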


52. Resonance scattering 

Suppose the energy of the incident particle to be varied continuously
while the initial state $\alpha^0$ of the scatterer is kept fixed, so
that the total energy E' or H' varies continuously. The formula (44)
now shows that as E' approaches one of the discrete set of energy-levels
$E_k$, the scattering becomes very large. In fact, according to
formula (44) the scattering should be infinite when E' is exactly equal
to an $E_k$. An infinite scattering coefficient is, of course, physically
impossible, so that we can infer that the approximations used in
deriving (44) are no longer legitimate when E' is close to an $E_k$. To
investigate the scattering in this case we must therefore go back to
the exact equation

$$(E'-E)|H'\rangle = V|H'\rangle,$$

equation (2) of § 43 with E' written for H', and use a different method
of approximating to its solution. This exact equation, written in
terms of representatives like (41), becomes

$$\begin{aligned}
(W'-W)\langle p\alpha'|H'\rangle &= \sum_{\alpha''}\int\langle p\alpha'|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|H'\rangle+\sum_{k''}\langle p\alpha'|V|k''\rangle\langle k''|H'\rangle,\\
(E'-E_k)\langle k|H'\rangle &= \sum_{\alpha''}\int\langle k|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|H'\rangle+\sum_{k''}\langle k|V|k''\rangle\langle k''|H'\rangle.
\end{aligned} \tag{45}$$

Let us take one particular $E_k$ and consider the case when E' is close
to it. The large term in the scattering coefficient (44) now arises from
those elements of the matrix representing V that lie in row k or in
column k, i.e. those of the type $\langle k|V|p\alpha'\rangle$ or $\langle p\alpha'|V|k\rangle$. The scattering
arising from the other matrix elements of V is of a smaller order
of magnitude. This suggests that in our exact equations (45) we should
make the approximation of neglecting all the matrix elements of V
except the important ones, which are those of the type $\langle p\alpha'|V|k\rangle$ or
$\langle k|V|p\alpha'\rangle$, where $\alpha'$ is a state of the scatterer that has not too much
energy to be disallowed as a final state by the law of conservation of
energy. These equations then reduce to

$$\begin{aligned}
(W'-W)\langle p\alpha'|H'\rangle &= \langle p\alpha'|V|k\rangle\langle k|H'\rangle,\\
(E'-E_k)\langle k|H'\rangle &= \sum_{\alpha'}\int\langle k|V|p\alpha'\rangle\,d^3p\,\langle p\alpha'|H'\rangle,
\end{aligned} \tag{46}$$

the $\alpha'$ summation being over those values of $\alpha'$ for which W' given
by (18) is $> mc^2$. These equations are now sufficiently simple for us
to be able to solve exactly without further approximation.

From the first of equations (46) we obtain by division

$$\langle p\alpha'|H'\rangle = \langle p\alpha'|V|k\rangle\langle k|H'\rangle/(W'-W)+\lambda\,\delta(W'-W). \tag{47}$$

We must choose $\lambda$, which may be any function of the momentum
p and $\alpha'$, such that (47) represents the incident particles corresponding
to $|0\rangle$ or $h^{3/2}|p^0\alpha^0\rangle$ together with only outward moving particles. [The
representative of $h^{3/2}|p^0\alpha^0\rangle$ is actually of the form $\lambda\,\delta(W'-W)$, since
the conditions $\alpha' = \alpha^0$ and $p = p^0$ for it not to vanish lead to
$W' = E'-H_s(\alpha') = E'-H_s(\alpha^0) = W^0 = W$.] Thus (47) must be

$$\langle p\alpha'|H'\rangle = h^{3/2}\langle p\alpha'|p^0\alpha^0\rangle+\langle p\alpha'|V|k\rangle\langle k|H'\rangle\{1/(W'-W)-i\pi\,\delta(W'-W)\}, \tag{48}$$

and from the general formula (37) the scattering coefficient will be

$$4\pi^2W^0W'P'/hc^4P^0\cdot|\langle p'\alpha'|V|k\rangle|^2\,|\langle k|H'\rangle|^2. \tag{49}$$

It remains for us to determine the value of $\langle k|H'\rangle$. We can do this
by substituting for $\langle p\alpha'|H'\rangle$ in the second of equations (46) its value
given by (48). This gives

$$(E'-E_k)\langle k|H'\rangle = h^{3/2}\langle k|V|p^0\alpha^0\rangle+\langle k|H'\rangle\sum_{\alpha'}\int|\langle k|V|p\alpha'\rangle|^2\{1/(W'-W)-i\pi\,\delta(W'-W)\}\,d^3p$$

$$= h^{3/2}\langle k|V|p^0\alpha^0\rangle+\langle k|H'\rangle(a-ib),$$

where

$$a = \sum_{\alpha'}\int|\langle k|V|p\alpha'\rangle|^2\,d^3p/(W'-W) \tag{50}$$

and

$$b = \pi\sum_{\alpha'}\int|\langle k|V|p\alpha'\rangle|^2\,\delta(W'-W)\,d^3p$$

$$= \pi\sum_{\alpha'}\iiint|\langle k|V|P\omega\chi\alpha'\rangle|^2\,\delta(W'-W)\,P^2dP\,\sin\omega\,d\omega\,d\chi$$

$$= \pi\sum_{\alpha'}P'W'c^{-2}\iint|\langle k|V|P'\omega\chi\alpha'\rangle|^2\,\sin\omega\,d\omega\,d\chi. \tag{51}$$

Thus

$$\langle k|H'\rangle = h^{3/2}\langle k|V|p^0\alpha^0\rangle/(E'-E_k-a+ib). \tag{52}$$

Note that a and b are real and that b is positive.

This value for $\langle k|H'\rangle$ substituted in (49) gives for the scattering
coefficient

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\,\frac{|\langle p'\alpha'|V|k\rangle|^2\,|\langle k|V|p^0\alpha^0\rangle|^2}{(E'-E_k-a)^2+b^2}. \tag{53}$$

One can obtain the total effective area that the incident particle
must hit in order to be scattered anywhere by integrating (53) over
all directions of scattering, i.e. by integrating over all directions of
the vector p' with its magnitude kept fixed at P', and then summing
over all $\alpha'$ that are to be taken into consideration, i.e. for which
$W' > mc^2$. This gives, with the help of (51), the result

$$\frac{4\pi h^2W^0}{c^2P^0}\,\frac{b\,|\langle k|V|p^0\alpha^0\rangle|^2}{(E'-E_k-a)^2+b^2}. \tag{54}$$


If we suppose E' to vary continuously through the value $E_k$, the
main variation of (53) or (54) will be due to the small denominator
$(E'-E_k-a)^2+b^2$. If we neglect the dependence of the other factors
in (53) and (54) on E', then the maximum scattering will occur when
E' has the value $E_k+a$ and the scattering will be half its maximum
when E' differs from this value by an amount b. The large amount of
scattering that occurs for values of the energy of the incident particle
that make E' nearly equal to $E_k$ gives rise to the phenomenon of an
absorption line. The centre of the line is displaced by an amount
a from the resonance energy of the incident particle, i.e. the energy
which would make the total energy just $E_k$, while the quantity b is
what is sometimes called the half-width of the line.
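
The statements about the position of the maximum and the half-width follow from the denominator alone, and may be checked with a few lines of arithmetic. The sketch below evaluates only the factor $1/\{(E'-E_k-a)^2+b^2\}$, treating the other factors as constant as in the text; the function name and the values chosen for $E_k$, a, and b are hypothetical.

```python
def resonance_factor(e_total, e_k, a, b):
    """Energy-dependent factor 1/((E' - E_k - a)^2 + b^2) common to (53) and (54)."""
    return 1.0 / ((e_total - e_k - a) ** 2 + b ** 2)

# Hypothetical level, shift and half-width in arbitrary energy units.
e_k, a, b = 10.0, 0.02, 0.05

peak = resonance_factor(e_k + a, e_k, a, b)          # centre of the line, displaced by a
upper = resonance_factor(e_k + a + b, e_k, a, b)     # one half-width above the centre
lower = resonance_factor(e_k + a - b, e_k, a, b)     # one half-width below the centre
print(upper / peak, lower / peak)
```

Both ratios come out as one half, independently of the particular values chosen, since the denominator doubles from $b^2$ to $2b^2$.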

53. Emission and absorption 

For studying emission and absorption we must consider non-stationary
states of the system and must use the perturbation method
of § 44. To determine the coefficient of spontaneous emission we must
take an initial state for which the particle is absorbed, corresponding
to a ket $|k\rangle$, and determine the probability that at some later time
the particle shall be on its way to infinity with a definite momentum.
The method of § 46 can now be applied. From the result (39) of that
section we see that the probability per unit time per unit range of $\omega$
and $\chi$, of the particle being emitted in any direction $\omega'$, $\chi'$ with the
scatterer being left in state $\alpha'$ is

$$2\pi\hbar^{-1}|\langle W'\omega'\chi'\alpha'|V|k\rangle|^2, \tag{55}$$

provided, of course, that $\alpha'$ is such that the energy W', given by (18),
of the particle is greater than $mc^2$. For values of $\alpha'$ that do not satisfy
this condition there is no emission possible. The matrix element
$\langle W'\omega'\chi'\alpha'|V|k\rangle$ here must refer to a representation in which W, $\omega$, $\chi$,
and $\alpha$ are diagonal with the weight function unity. The matrix
elements of V appearing in the three preceding sections refer to a representation
in which $p_x$, $p_y$, $p_z$ are diagonal with the weight function
unity, or P, $\omega$, $\chi$ are diagonal with the weight function $P^2\sin\omega$.
They would thus refer to a representation in which W, $\omega$, $\chi$ are
diagonal with the weight function $dP/dW\cdot P^2\sin\omega = WP/c^2\cdot\sin\omega$.
Thus the matrix element $\langle W'\omega'\chi'\alpha'|V|k\rangle$ in (55) is equal to
$(W'P'/c^2\cdot\sin\omega')^{1/2}$ times our previous matrix element $\langle W'\omega'\chi'\alpha'|V|k\rangle$
or $\langle p'\alpha'|V|k\rangle$, so that (55) is equal to

$$2\pi\hbar^{-1}\,\frac{W'P'}{c^2}\sin\omega'\,|\langle p'\alpha'|V|k\rangle|^2.$$

The probability of emission per unit solid angle per unit time, with
the scatterer simultaneously dropping to state $\alpha'$, is thus

$$\frac{2\pi}{\hbar}\frac{W'P'}{c^2}|\langle p'\alpha'|V|k\rangle|^2. \tag{56}$$

To obtain the total probability per unit time of the particle being
emitted in any direction, with any final state for the scatterer, we
must integrate (56) over all angles $\omega'$, $\chi'$ and sum over all states $\alpha'$
whose energy $H_s(\alpha')$ is such that $H_s(\alpha')+mc^2 < E_k$. The result is
just $2b/\hbar$, where b is defined by (51). There is thus this simple relation
between the total emission coefficient and the half-width b of the
absorption line.
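
To give a sense of scale for the relation between the total emission coefficient and the half-width, the sketch below converts a half-width b expressed in electron-volts into an emission probability per second by means of $2b/\hbar$. The function name and the sample half-width are hypothetical; the value of $\hbar$ in eV s is the standard one.

```python
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s

def total_emission_rate(b_ev):
    """Total emission probability per unit time, 2b/hbar, for a half-width b in eV."""
    return 2.0 * b_ev / HBAR_EV_S

# Hypothetical half-width of 1e-7 eV.
rate = total_emission_rate(1e-7)
print(rate)  # of the order of 3e8 per second
```

A narrower line thus corresponds to a proportionally smaller emission probability per unit time.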

Let us now consider absorption. This requires that we shall take 
an initial state for which the particle is certainly not absorbed but is 
incident with a definite momentum. Thus the ket corresponding to 
the initial state must be of the form (19). We must now determine 
the probability of the particle being absorbed after time t. Since our 
final state $k$ is not one of a continuous range, we cannot use directly
the result (39) of § 46. If, however, we take

$$ |0\rangle = |p^0\alpha^0\rangle, \tag{57} $$

as the ket corresponding to the initial state, the analysis of §§ 44 and 46
is still applicable as far as equation (36) and shows us that the
probability of the particle being absorbed into state $k$ after time $t$ is

$$ 2\,|\langle k|V|p^0\alpha^0\rangle|^2\,[1-\cos\{(E_k-E')t/\hbar\}]/(E_k-E')^2. $$

This corresponds to a distribution of incident particles of density
$h^{-3}$, owing to the omission of the factor $h^{3/2}$ from (57), as compared
with (19). The probability of there being an absorption after time
$t$ when there is one incident particle crossing unit area per unit time
is therefore

$$ \frac{2h^3W^0}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2\,[1-\cos\{(E_k-E')t/\hbar\}]/(E_k-E')^2. \tag{58} $$
To obtain the absorption coefficient we must consider the incident
particles not all to have exactly the same energy $W^0 = E' - H_s(\alpha^0)$,
but to have a distribution of energy values about the correct value
$E_k - H_s(\alpha^0)$ required for absorption. If we take a beam of incident
particles consisting of one crossing unit area per unit time per unit
energy range, the probability of there being an absorption after time
$t$ will be given by the integral of (58) with respect to $W^0$. This integral
may be evaluated in the same way as (37) of § 46 and is equal to

$$ \frac{4\pi^2h^2W^0t}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2. $$
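
The step from (58) to the expression just given rests on the integral of $[1-\cos(xt/\hbar)]/x^2$ over all $x = E_k - E'$, which equals $\pi t/\hbar$; multiplied by the factor $2h^3$ of (58) this gives $4\pi^2h^2t$. The following Python fragment is only a rough numerical check of that integral, in units with $\hbar = 1$. The function name, the finite cut-off and the number of steps are illustrative assumptions and do not come from the text.

```python
import math

def growth_integral(t, cutoff=400.0, steps=400_000):
    """Midpoint-rule estimate of the integral of [1 - cos(x t)] / x**2
    over -cutoff < x < cutoff, in units with hbar = 1 (an assumption).

    The midpoints never fall on x = 0, where the integrand has only a
    removable singularity (its limit there is t**2 / 2).
    """
    width = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        x = -cutoff + (i + 0.5) * width
        total += (1.0 - math.cos(x * t)) / (x * x)
    return total * width

if __name__ == "__main__":
    for t in (1.0, 2.0):
        # The two printed numbers should agree to a few parts in a thousand.
        print(t, growth_integral(t), math.pi * t)
```

Because the range is cut off, the estimate falls short of $\pi t$ by roughly 2/cutoff, so only approximate agreement should be expected.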

The probability per unit time of an absorption taking place with an 
incident beam of one particle per unit area per unit time per unit 
energy range is therefore 

$$ \frac{4\pi^2h^2W^0}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2, \tag{59} $$

which is the absorption coefficient.

The connexion between the absorption and emission coefficients
(59) and (56) and the resonance scattering coefficients calculated in 
the preceding section should be noted. When the incident beam does 
not consist of particles all with the same energy, but consists of a unit 
distribution of particles per unit energy range crossing unit area per 
unit time, the total number of incident particles with energies near 
an absorption line that get scattered will be given by the integral 
of (54) with respect to $E'$. If one neglects the dependence of the
numerator of (54) on $E'$, this integral will, since

$$ \int_{-\infty}^{\infty} \frac{b}{(E'-E_k-a)^2+b^2}\,dE' = \pi, $$

have just the value (59). Thus the total number of scattered particles 
in the neighbourhood of an absorption line is equal to the total number 
absorbed. We can therefore regard all these scattered particles as 
absorbed particles that are subsequently re-emitted in a different 
direction. Further, the number of particles in the neighbourhood of 
the absorption line that get scattered per unit solid angle about a 
given direction specified by $p'$ and then belong to scatterers in state
$\alpha'$ will be given by the integral with respect to $E'$ of (53), which
integral has in the same way the value

$$ \frac{4\pi^3h^2}{b}\,\frac{W^0W'P'}{c^4P^0}\,|\langle p'\alpha'|V|k\rangle|^2\,|\langle k|V|p^0\alpha^0\rangle|^2. $$

This is just equal to the absorption coefficient (59) multiplied by the
emission coefficient (56) divided by $2b/\hbar$, the total emission coefficient.
This is in agreement with the point of view of regarding the resonance 
scattered particles as those that are absorbed and then re-emitted, 
with the absorption and emission processes governed independently 
each by its own probability law, since this point of view would 
make the fraction of the total number of absorbed particles that are 
re-emitted in a unit solid angle about a given direction just the 
emission coefficient for this direction divided by the total emission 
coefficient. 
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54. Symmetrical and antisymmetrical states 

If a system in atomic physics contains a number of particles of the 
same kind, e.g. a number of electrons, the particles are absolutely 
indistinguishable one from another. No observable change is made 
when two of them are interchanged. This circumstance gives rise to 
some curious phenomena in quantum mechanics having no analogue 
in the classical theory, which arise from the fact that in quantum 
mechanics a transition may occur resulting in merely the interchange 
of two similar particles, which transition then could not be detected 
by any observational means. A satisfactory theory ought, of course, 
to count two observationally indistinguishable states as the same 
state and to deny that any transition does occur when two similar 
particles exchange places. We shall find that it is possible to
reformulate the theory so that this is so.

Suppose we have a system containing $n$ similar particles. We may
take as our dynamical variables a set of variables $\xi_1$ describing the
first particle, the corresponding set $\xi_2$ describing the second particle,
and so on up to the set $\xi_n$ describing the $n$th particle. We shall then
have the $\xi_r$'s commuting with the $\xi_s$'s for $r \neq s$. (We may require
certain extra variables, describing what the system consists of in
addition to the $n$ similar particles, but it is not necessary to mention
these explicitly in the present chapter.) The Hamiltonian describing
the motion of the system will now be expressible as a function of the
$\xi_1, \xi_2, \ldots, \xi_n$. The fact that the particles are similar requires that the
Hamiltonian shall be a symmetrical function of the $\xi_1, \xi_2, \ldots, \xi_n$, i.e. it
shall remain unchanged when the sets of variables $\xi_r$ are interchanged
or permuted in any way. This condition must hold, no matter what
perturbations are applied to the system. In fact, any quantity of
physical significance must be a symmetrical function of the $\xi$'s.

Let $|a_1\rangle, |b_1\rangle, \ldots$ be kets for the first particle considered as a
dynamical system by itself. There will be corresponding kets
$|a_2\rangle, |b_2\rangle, \ldots$ for the second particle by itself, and so on. We can get
a ket for the assembly by taking the product of kets for each particle
by itself, for example

$$ |a_1\rangle|b_2\rangle|c_3\rangle\ldots|g_n\rangle = |a_1b_2c_3\ldots g_n\rangle \tag{1} $$

say, according to the notation of (65) of § 20. The ket (1) corresponds
to a special kind of state for the assembly, which may be described
by saying that each particle is in its own state, corresponding to its
own factor on the left-hand side of (1). The general ket for the
assembly is of the form of a sum or integral of kets like (1), and
corresponds to a state for the assembly for which one cannot say that
each particle is in its own state, but only that each particle is partly
in several states, in a way which is correlated with the other particles
being partly in several states. If the kets $|a_1\rangle, |b_1\rangle, \ldots$ are a set of
basic kets for the first particle by itself, the kets $|a_2\rangle, |b_2\rangle, \ldots$ will be
a set of basic kets for the second particle by itself, and so on, and the
kets (1) will be a set of basic kets for the assembly. We call the
representation provided by such basic kets for the assembly a symmetrical
representation, as it treats all the particles on the same footing.

In (1) we may interchange the kets for the first two particles and
get another ket for the assembly, namely

$$ |b_1\rangle|a_2\rangle|c_3\rangle\ldots|g_n\rangle = |b_1a_2c_3\ldots g_n\rangle. $$


More generally, we may interchange the role of the first two particles 
in any ket for the assembly and get another ket for the assembly. 
The process of interchanging the first two particles is an operator 
which can be applied to kets for the assembly, and is evidently a 
linear operator, of the type dealt with in § 7. Similarly, the process 
of interchanging any pair of particles is a linear operator, and by 
repeated applications of such interchanges we get any permutation 
of the particles appearing as a linear operator which can be applied 
to kets for the assembly. A permutation is called an even permutation 
or an odd permutation according to whether it can be built up from 
an even or an odd number of interchanges. 
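
The distinction between even and odd permutations can be made concrete by counting interchanges. The sketch below is an illustration only: the function name `parity` and the convention of writing a permutation as a 0-based tuple are assumptions made here, not notation from the text.

```python
def parity(perm):
    """Return +1 for an even permutation and -1 for an odd one.

    `perm` is a tuple in one-line form: perm[i] is the image of i,
    with objects numbered from 0 (an assumed convention). The tuple is
    sorted back to the identity by interchanges, and each interchange
    flips the sign.
    """
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # one interchange of two objects
            sign = -sign
    return sign

if __name__ == "__main__":
    print(parity((1, 0, 2)))   # a single interchange
    print(parity((1, 2, 0)))   # a cycle of three objects
```

A single interchange comes out odd, while a cycle of three objects, which can be built from two interchanges, comes out even.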

A ket for the assembly $|X\rangle$ is called symmetrical if it is unchanged
by any permutation, i.e. if

$$ P|X\rangle = |X\rangle \tag{2} $$

for any permutation $P$. It is called antisymmetrical if it is unchanged
by any even permutation and has its sign changed by any odd
permutation, i.e. if

$$ P|X\rangle = \pm|X\rangle, \tag{3} $$

the $+$ or $-$ sign being taken according to whether $P$ is even or odd.
The state corresponding to a symmetrical ket is called a symmetrical
state, and the state corresponding to an antisymmetrical ket is called
an antisymmetrical state. In a symmetrical representation, the
representative of a symmetrical ket is a symmetrical function of the
variables referring to the various particles and the representative of
an antisymmetrical ket is an antisymmetrical function.

In the Schrödinger picture, the ket corresponding to a state of the
assembly will vary with time according to Schrödinger's equation of
motion. If it is initially symmetrical it must always remain
symmetrical, since, owing to the Hamiltonian being symmetrical, there
is nothing to disturb the symmetry. Similarly if the ket is initially
antisymmetrical it must always remain antisymmetrical. Thus a
state which is initially symmetrical always remains symmetrical and
a state which is initially antisymmetrical always remains
antisymmetrical. In consequence, it may be that for a particular kind of
particle only symmetrical states occur in nature, or only
antisymmetrical states occur in nature. If either of these possibilities
held, it would lead to certain special phenomena for the particles in
question.

Let us suppose first that only antisymmetrical states occur in
nature. The ket (1) is not antisymmetrical and so does not
correspond to a state occurring in nature. From (1) we can in general form
an antisymmetrical ket by applying all possible permutations to it
and adding the results, with the coefficient $-1$ inserted before those
terms arising from an odd permutation, so as to get

$$ \sum_P \pm P|a_1b_2c_3\ldots g_n\rangle, \tag{4} $$

the $+$ or $-$ sign being taken according to whether $P$ is even or odd.
The ket (4) may be written as a determinant

$$ \begin{vmatrix}
|a_1\rangle & |a_2\rangle & |a_3\rangle & \cdots & |a_n\rangle \\
|b_1\rangle & |b_2\rangle & |b_3\rangle & \cdots & |b_n\rangle \\
|c_1\rangle & |c_2\rangle & |c_3\rangle & \cdots & |c_n\rangle \\
\vdots & \vdots & \vdots & & \vdots \\
|g_1\rangle & |g_2\rangle & |g_3\rangle & \cdots & |g_n\rangle
\end{vmatrix} \tag{5} $$

and its representative in a symmetrical representation is a
determinant. The ket (4) or (5) is not the general antisymmetrical ket, but
is a specially simple one. It corresponds to a state for the assembly
for which one can say that certain particle-states, namely the states
$a, b, c, \ldots, g$, are occupied, but one cannot say which particle is in
which state, each particle being equally likely to be in any state. If
two of the particle-states $a, b, c, \ldots, g$ are the same, the ket (4) or (5)
vanishes and does not correspond to any state for the assembly.
Thus two particles cannot occupy the same state. More generally, the
occupied states must be all independent, otherwise (4) or (5) vanishes.
This is an important characteristic of particles for which only
antisymmetrical states occur in nature. It leads to a special statistics,
which was first studied by Fermi, so we shall call particles for which
only antisymmetrical states occur in nature fermions.
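
The vanishing of (4) when two particle-states coincide can be seen in a small computation. The sketch below is an illustration under stated assumptions: a product ket such as (1) is represented by a tuple of state labels whose position stands for the particle, and a general ket by a dictionary of integer coefficients. The names `parity` and `antisymmetrize` are hypothetical helpers, not notation from the text.

```python
from itertools import permutations
from collections import Counter

def parity(perm):
    """+1 for an even permutation (0-based tuple), -1 for an odd one."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def antisymmetrize(states):
    """Form the sum over all permutations of +/- the permuted product ket.

    `states` is a tuple of one-particle state labels, e.g. ("a", "b", "c").
    The result maps each product ket (a tuple) to its coefficient, with
    terms whose coefficients cancel to zero dropped.
    """
    ket = Counter()
    for perm in permutations(range(len(states))):
        permuted = tuple(states[i] for i in perm)
        ket[permuted] += parity(perm)
    return {k: c for k, c in ket.items() if c != 0}

if __name__ == "__main__":
    print(antisymmetrize(("a", "b", "c")))   # six terms with coefficients +1 or -1
    print(antisymmetrize(("a", "a", "c")))   # every term cancels
```

With three distinct labels all six product kets survive; with a repeated label the terms cancel in pairs and nothing is left, which is the statement that two particles cannot occupy the same state.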

Let us suppose now that only symmetrical states occur in nature.
The ket (1) is not symmetrical, except in the special case when all the
particle-states $a, b, c, \ldots, g$ are the same, but we can always obtain a
symmetrical ket from it by applying all possible permutations to it
and adding the results, so as to get

$$ \sum_P P|a_1b_2c_3\ldots g_n\rangle. \tag{6} $$

The ket (6) is not the general symmetrical ket, but is a specially
simple one. It corresponds to a state for the assembly for which one
can say that certain particle-states are occupied, namely the states
$a, b, c, \ldots, g$, without being able to say which particle is in which state.
It is now possible for two or more of the states $a, b, c, \ldots, g$ to be the
same, so that two or more particles can be in the same state. In spite
of this, the statistics of the particles is not the same as the usual
statistics of the classical theory. The new statistics was first studied
by Bose, so we shall call particles for which only symmetrical states
occur in nature bosons.

We can see the difference of Bose statistics from the usual statistics
by considering a special case, that of only two particles and only two
independent states $a$ and $b$ for a particle. According to classical
mechanics, if the assembly of two particles is in thermodynamic
equilibrium at a high temperature, each particle will be equally likely
to be in either state. There is thus a probability $\tfrac14$ of both particles
being in state $a$, a probability $\tfrac14$ of both particles being in state $b$,
and a probability $\tfrac12$ of one particle being in each state. In the
quantum theory there are three independent symmetrical states for the
pair of particles, corresponding to the symmetrical kets $|a_1\rangle|a_2\rangle$,
$|b_1\rangle|b_2\rangle$, and $|a_1\rangle|b_2\rangle + |a_2\rangle|b_1\rangle$, and describable as both particles in
state $a$, both particles in state $b$, and one particle in each state
respectively. For thermodynamic equilibrium at a high temperature
these three states are equally probable, as was shown in § 33, so that
there is a probability $\tfrac13$ of both particles being in state $a$, a probability
$\tfrac13$ of both particles being in state $b$, and a probability $\tfrac13$ of one particle
being in each state. Thus with Bose statistics the probability of two
particles being in the same state is greater than with classical statistics.
Bose statistics differ from classical statistics in the opposite direction
to Fermi statistics, for which the probability of two particles being
in the same state is zero.
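
The three counts compared in this paragraph can be reproduced by enumeration. The sketch below assumes, as the text does for high temperature, that all allowed states of the pair are equally probable. The variable names are illustrative, and the use of ordered pairs, unordered pairs with repetition, and unordered pairs without repetition to stand for classical, symmetrical and antisymmetrical states is a modelling assumption of this sketch.

```python
from itertools import product, combinations, combinations_with_replacement
from fractions import Fraction

states = ("a", "b")

def probability_same(pairs):
    """Fraction of the equally probable pair-states with both particles in one state."""
    pairs = list(pairs)
    return Fraction(sum(x == y for x, y in pairs), len(pairs))

# Classical: the particles are distinguishable, so ordered pairs are counted.
classical = probability_same(product(states, repeat=2))
# Bose: one symmetrical state for each unordered pair, repetition allowed.
bose = probability_same(combinations_with_replacement(states, 2))
# Fermi: one antisymmetrical state for each pair of distinct particle-states.
fermi = probability_same(combinations(states, 2))

if __name__ == "__main__":
    print(classical, bose, fermi)
```

The three values come out as 1/2, 2/3 and 0, in the order classical, Bose, Fermi.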

In building up a theory of atoms on the lines mentioned at the
beginning of § 38, to get agreement with experiment one must assume
that two electrons are never in the same state. This rule is known as
Pauli's exclusion principle. It shows us that electrons are fermions.
Planck's law of radiation shows us that photons are bosons, as only the
Bose statistics for photons will lead to Planck's law. Similarly, for
each of the other kinds of particle known in physics, there is
experimental evidence to show either that they are fermions, or that they
are bosons. Protons, neutrons, positrons are fermions, $\alpha$-particles are
bosons. It appears that all particles occurring in nature are either
fermions or bosons, and thus only antisymmetrical or symmetrical
states for an assembly of similar particles are met with in practice.
Other more complicated kinds of symmetry are possible
mathematically, but do not apply to any known particles. With a theory which
allows only antisymmetrical or only symmetrical states for a
particular kind of particle, one cannot make a distinction between two states
which differ only through a permutation of the particles, so that the
transitions mentioned at the beginning of this section disappear.

55. Permutations as dynamical variables 

We shall now build up a general theory for a system containing $n$
similar particles when states with any kind of symmetry properties
are allowed, i.e. when there is no restriction to only symmetrical or
only antisymmetrical states. The general state now will not be
symmetrical or antisymmetrical, nor will it be expressible linearly in
terms of symmetrical and antisymmetrical states when $n > 2$. This
theory will not apply directly to any particles occurring in nature,
but all the same it is useful for setting up an approximate treatment
for an assembly of electrons, as will be shown in § 58.

We have seen that each permutation $P$ of the $n$ particles is a linear
operator which can be applied to any ket for the assembly. Hence
we can regard $P$ as a dynamical variable in our system of $n$ particles.
There are $n!$ permutations, each of which can be regarded as a
dynamical variable. One of them, $P_1$ say, is the identical permutation,
which is equal to unity. The product of any two permutations is a
third permutation and hence any function of the permutations is
reducible to a linear function of them. Any permutation $P$ has a
reciprocal $P^{-1}$ satisfying

$$ PP^{-1} = P^{-1}P = P_1 = 1. $$

A permutation $P$ can be applied to a bra $\langle X|$ for the assembly,
to give another bra, which we shall denote for the present by $P\langle X|$.
If $P$ is applied to both factors of the product $\langle X|Y\rangle$, the product
must be unchanged, since it is just a number, independent of any
order of the particles. Thus

$$ (P\langle X|)\,P|Y\rangle = \langle X|Y\rangle, $$

showing that

$$ P\langle X| = \langle X|P^{-1}. \tag{7} $$

Now $P\langle X|$ is the conjugate imaginary of $P|X\rangle$ and is thus equal to
$\langle X|\bar{P}$, and hence from (7)

$$ \bar{P} = P^{-1}. \tag{8} $$

Thus a permutation is not in general a real dynamical variable, its
conjugate complex being equal to its reciprocal.

Any permutation of the numbers $1, 2, 3, \ldots, n$ may be expressed in
the cyclic notation, e.g. with $n = 8$

$$ P_a = (143)(27)(58)(6), \tag{9} $$

in which each number is to be replaced by the succeeding number in
a bracket, unless it is the last in a bracket, when it is to be replaced
by the first in that bracket. Thus $P_a$ changes the numbers 12345678
into 47138625. The type of any permutation is specified by the
partition of the number $n$ which is provided by the number of
numbers in each of the brackets. Thus the type of $P_a$ is specified by the
partition $8 = 3+2+2+1$. Permutations of the same type, i.e.
corresponding to the same partition, we shall call similar. Thus, for
example, $P_a$ in (9) is similar to

$$ P_b = (871)(35)(46)(2). \tag{10} $$

The whole of the $n!$ possible permutations may be divided into sets
of similar permutations, each such set being called a class. The
permutation $P_1 = 1$ forms a class by itself. Any permutation is similar
to its reciprocal.

When two permutations $P_a$ and $P_b$ are similar, either of them $P_b$
may be obtained by making a certain permutation $P_x$ in the other
$P_a$. Thus, in our example (9), (10) we can take $P_x$ to be the
permutation that changes 14327586 into 87135462, i.e. the permutation

$$ P_x = (18623)(475). $$

Different ways of writing $P_a$ and $P_b$ in the cyclic notation would lead
to different $P_x$'s. Any of these $P_x$'s applied to the product $P_a|X\rangle$
would change it into $P_b \cdot P_x|X\rangle$, i.e.

$$ P_xP_a|X\rangle = P_bP_x|X\rangle. $$

Hence

$$ P_b = P_xP_aP_x^{-1}, \tag{11} $$

which expresses the condition for $P_a$ and $P_b$ to be similar as an
algebraic equation. The existence of any $P_x$ satisfying (11) is
sufficient to show that $P_a$ and $P_b$ are similar.
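
The worked example of (9), (10) and (11) can be checked mechanically. In the sketch below a permutation is stored as a dictionary sending each number to the number that replaces it, and products are composed as ordinary maps with the right-hand factor applied first. These representations, and the helper names `from_cycles`, `cycle_type`, `compose` and `inverse`, are assumptions made for illustration and not notation from the text.

```python
def from_cycles(cycles, n):
    """Build the map number -> replacing number from the cyclic notation."""
    mapping = {k: k for k in range(1, n + 1)}
    for cycle in cycles:
        for pos, k in enumerate(cycle):
            mapping[k] = cycle[(pos + 1) % len(cycle)]
    return mapping

def cycle_type(mapping):
    """Cycle lengths in decreasing order: the partition of n fixing the class."""
    seen, lengths = set(), []
    for start in mapping:
        if start in seen:
            continue
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = mapping[k]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)

def compose(p, q):
    """The product p q as a map: q is applied first, then p."""
    return {k: p[q[k]] for k in q}

def inverse(p):
    """The reciprocal permutation."""
    return {v: k for k, v in p.items()}

P_a = from_cycles([(1, 4, 3), (2, 7), (5, 8), (6,)], 8)
P_b = from_cycles([(8, 7, 1), (3, 5), (4, 6), (2,)], 8)
P_x = from_cycles([(1, 8, 6, 2, 3), (4, 7, 5)], 8)

# The row of numbers into which P_a changes 12345678.
image_of_P_a = "".join(str(P_a[k]) for k in range(1, 9))

if __name__ == "__main__":
    print(image_of_P_a)
    print(cycle_type(P_a), cycle_type(P_b))
    print(compose(compose(P_x, P_a), inverse(P_x)) == P_b)
```

With these conventions the row 12345678 goes over into 47138625, both $P_a$ and $P_b$ have the type 3+2+2+1, and the product $P_xP_aP_x^{-1}$ reproduces $P_b$.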

56. Permutations as constants of the motion 

Any symmetrical function $V$ of the dynamical variables of all the
particles is unchanged by the application of any permutation $P$, so
$P$ applied to the product $V|X\rangle$ affects only the factor $|X\rangle$, thus

$$ PV|X\rangle = VP|X\rangle. $$

Hence

$$ PV = VP, \tag{12} $$

showing that a symmetrical function of the dynamical variables
commutes with every permutation. The Hamiltonian is a symmetrical
function of the dynamical variables and thus commutes with every
permutation. It follows that each permutation is a constant of the
motion. This holds even if the Hamiltonian is not constant. If $|Xt\rangle$
is any solution of Schrödinger's equation of motion, $P|Xt\rangle$ is another.

In dealing with any system in quantum mechanics, when we have
found a constant of the motion $\alpha$, we know that if for any state of
motion $\alpha$ initially has the numerical value $\alpha'$, then it always has this
value, so that we can assign different numbers $\alpha'$ to the different
states and so obtain a classification of the states. The procedure is
not so straightforward, however, when we have several constants of
the motion $\alpha$ which do not commute (as is the case with our
permutations $P$), since we cannot in general assign numerical values for all
the $\alpha$'s simultaneously to any state. Let us first take the case of a
system whose Hamiltonian does not involve the time explicitly. The
existence of constants of the motion $\alpha$ which do not commute is
then a sign that the system is degenerate. This is because, for a
non-degenerate system, the Hamiltonian $H$ by itself forms a complete
set of commuting observables and hence, from Theorem 2 of § 19, each
of the $\alpha$'s is a function of $H$ and therefore commutes with any other $\alpha$.

We must now look for a function $\beta$ of the $\alpha$'s which has one and
the same numerical value $\beta'$ for all those states belonging to one
energy-level $H'$, so that we can use $\beta$ for classifying the energy-levels
of the system. We can express the condition for $\beta$ by saying that it
must be a function of $H$ and must therefore commute with every
dynamical variable that commutes with $H$, i.e. with every constant
of the motion. If the $\alpha$'s are the only constants of the motion, or if
they are a set that commute with all other independent constants of
the motion, our problem reduces to finding a function of the $\alpha$'s
which commutes with all the $\alpha$'s. We can then assign a numerical
value $\beta'$ for $\beta$ to each energy-level of the system. If we can find
several such functions $\beta$, they must all commute with each other, so
that we can give them all numerical values simultaneously. We
obtain thus a classification of the energy-levels. When the Hamiltonian
involves the time explicitly one cannot talk about energy-levels, but
the $\beta$'s will still give a useful classification of the states.

We follow this method in dealing with our permutations $P$. We
must find a function $\chi$ of the $P$'s such that $P\chi P^{-1} = \chi$ for every $P$.
It is evident that a possible $\chi$ is $\sum P_c$, the sum of all the permutations
in a certain class $c$, i.e. the sum of a set of similar permutations, since
$\sum PP_cP^{-1}$ must consist of the same permutations summed in a
different order. There will be one such $\chi$ for each class. Further, there can
be no other independent $\chi$, since an arbitrary function of the $P$'s can
be expressed as a linear function of them with numerical coefficients,
and it will not then commute with every $P$ unless the coefficients of
similar $P$'s are always the same. We thus obtain all the $\chi$'s that can
be used for classifying the states. It is convenient to define each $\chi$ as
an average instead of a sum, thus

$$ \chi_c = n_c^{-1}\sum P_c, $$

where $n_c$ is the number of $P$'s in the class $c$. An alternative expression
for $\chi_c$ is

$$ \chi_c = (n!)^{-1}\sum_P PP_cP^{-1}, \tag{13} $$

the sum being extended over all the $n!$ permutations $P$, it being easy
to verify that this sum contains each member of the class $c$ the same
number of times. For each permutation $P$ there is one $\chi$, $\chi(P)$ say,
equal to the average of all permutations similar to $P$. One of the
$\chi$'s is $\chi(P_1) = 1$.

The constants of the motion $\chi_1, \chi_2, \ldots, \chi_m$ obtained in this way will
each have a definite numerical value for every stationary state of the
system, in the case when the Hamiltonian does not involve the time
explicitly, and also in the general case can be used for classifying
the states, there being one set of states for every permissible set of
numerical values $\chi_1', \chi_2', \ldots, \chi_m'$ for the $\chi$'s. Since the $\chi$'s are always
constants of the motion, these sets of states will be exclusive, i.e.
transitions will never take place from a state in one set to a state in
another.
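
The two properties of the class averages used above, that each commutes with every permutation and that the average over conjugates (13) reproduces it, can be verified directly for a small case. The sketch below takes $n = 3$ and represents a linear function of the permutations as a dictionary from permutation tuples to rational coefficients. This representation and all the helper names are illustrative assumptions, not part of the text.

```python
from itertools import permutations
from fractions import Fraction
from collections import defaultdict

n = 3
group = list(permutations(range(n)))   # all n! permutations, 0-based tuples

def mul(p, q):
    """Product of two permutations as maps (q applied first)."""
    return tuple(p[q[i]] for i in range(n))

def inv(p):
    """Reciprocal permutation."""
    r = [0] * n
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

def cycle_type(p):
    """Cycle lengths in decreasing order; this labels the class of p."""
    seen, lengths = set(), []
    for start in range(n):
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = p[k]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def algebra_mul(x, y):
    """Product of two linear functions of the permutations."""
    out = defaultdict(Fraction)
    for p, cp in x.items():
        for q, cq in y.items():
            out[mul(p, q)] += cp * cq
    return {p: c for p, c in out.items() if c != 0}

# Sort the permutations into classes and form the class averages.
classes = defaultdict(list)
for p in group:
    classes[cycle_type(p)].append(p)
chi = {c: {p: Fraction(1, len(members)) for p in members}
       for c, members in classes.items()}

def conjugation_average(p_c):
    """The alternative expression (13): average of P P_c P^-1 over all P."""
    out = defaultdict(Fraction)
    for p in group:
        out[mul(mul(p, p_c), inv(p))] += Fraction(1, len(group))
    return dict(out)

commutes = all(
    algebra_mul({p: Fraction(1)}, chi_c) == algebra_mul(chi_c, {p: Fraction(1)})
    for p in group for chi_c in chi.values()
)
matches_13 = all(conjugation_average(members[0]) == chi[c]
                 for c, members in classes.items())

if __name__ == "__main__":
    print(commutes, matches_13)
```

Both checks are expected to hold for any $n$; the value 3 is chosen only to keep the enumeration small.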

The permissible sets of values $\chi'$ that one can give to the $\chi$'s are
limited by the fact that there exist algebraic relations between the
$\chi$'s. The product of any two $\chi$'s, $\chi_p\chi_q$, is of course expressible as
a linear function of the $P$'s, and since it commutes with every $P$ it
must be expressible as a linear function of the $\chi$'s, thus

$$ \chi_p\chi_q = a_1\chi_1 + a_2\chi_2 + \ldots + a_m\chi_m, \tag{14} $$

where the $a$'s are numbers. Any numerical values $\chi'$ that one gives
to the $\chi$'s must be eigenvalues of the $\chi$'s and must satisfy these same
algebraic equations. For every solution $\chi'$ of these equations there
is one exclusive set of states. One solution is evidently $\chi_p' = 1$ for
every $\chi_p$, giving the set of symmetrical states. A second obvious
solution, giving the set of antisymmetrical states, is $\chi_p' = \pm 1$, the
$+$ or $-$ sign being taken according to whether the permutations in
the class $p$ are even or odd. The other solutions may be worked out
in any special case by ordinary algebraic methods, as the coefficients
$a$ in (14) may be obtained directly by a consideration of the types
of permutation to which the $\chi$'s concerned refer. Any solution is,
apart from a certain factor, what is called in group theory a character
of the group of permutations. The $\chi$'s are all real dynamical variables,
since each $P$ and its conjugate complex $P^{-1}$ are similar and will occur
added together in the definition of any $\chi$, so that the $\chi'$'s must be all
real numbers.
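
For three particles the equations (14) are small enough to work out by enumeration. The sketch below computes the coefficients $a$ for every pair of classes and then tests candidate sets of values $\chi'$ against them. The labelling of classes by their cycle lengths, the helper names, and the particular candidate sets tried are assumptions of this illustration.

```python
from itertools import permutations
from fractions import Fraction
from collections import defaultdict

n = 3
group = list(permutations(range(n)))

def mul(p, q):
    """Product of two permutations as maps (q applied first)."""
    return tuple(p[q[i]] for i in range(n))

def cycle_type(p):
    """Cycle lengths in decreasing order; this labels the class of p."""
    seen, lengths = set(), []
    for start in range(n):
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = p[k]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

classes = defaultdict(list)
for p in group:
    classes[cycle_type(p)].append(p)

def coefficients(cp, cq):
    """Numbers a_c with chi_p chi_q = sum over c of a_c chi_c, as in (14)."""
    prod = defaultdict(Fraction)
    weight = Fraction(1, len(classes[cp]) * len(classes[cq]))
    for p in classes[cp]:
        for q in classes[cq]:
            prod[mul(p, q)] += weight
    a = {}
    for c, members in classes.items():
        values = {prod.get(p, Fraction(0)) for p in members}
        assert len(values) == 1            # the product is constant on each class
        a[c] = values.pop() * len(members)
    return a

def satisfies(values):
    """True if the numbers `values` obey every equation of the form (14)."""
    return all(
        values[cp] * values[cq]
        == sum(a * values[c] for c, a in coefficients(cp, cq).items())
        for cp in classes for cq in classes
    )

symmetric = {c: Fraction(1) for c in classes}
antisymmetric = {c: Fraction((-1) ** (n - len(c))) for c in classes}
third = {(1, 1, 1): Fraction(1), (2, 1): Fraction(0), (3,): Fraction(-1, 2)}
not_a_solution = {(1, 1, 1): Fraction(1), (2, 1): Fraction(1), (3,): Fraction(-1)}

if __name__ == "__main__":
    for name, vals in [("symmetric", symmetric), ("antisymmetric", antisymmetric),
                       ("third", third), ("not_a_solution", not_a_solution)]:
        print(name, satisfies(vals))
```

The first two candidate sets are the symmetrical and antisymmetrical solutions named in the text. The third, with values 1, 0 and $-\tfrac12$ for the three classes, is the remaining solution for $n = 3$, and the last set is included only to show a choice that fails.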

The number of possible solutions of the equations (14) may easily
be determined, since it must equal the number of different
eigenvalues of an arbitrary function $B$ of the $\chi$'s. We can express $B$ as
a linear function of the $\chi$'s with the help of equations (14); thus

$$ B = b_1\chi_1 + b_2\chi_2 + \ldots + b_m\chi_m. \tag{15} $$

Similarly, we can express each of the quantities $B^2, B^3, \ldots, B^m$ as a
linear function of the $\chi$'s. From the $m$ equations thus obtained,
together with the equation $\chi(P_1) = 1$, we can eliminate the $m$
unknowns $\chi_1, \chi_2, \ldots, \chi_m$, obtaining as result an algebraic equation of
degree $m$ for $B$,

$$ B^m + c_1B^{m-1} + c_2B^{m-2} + \ldots + c_m = 0. $$

The $m$ solutions of this equation give the $m$ possible eigenvalues
for $B$, each of which will, according to (15), be a linear function of
$b_1, b_2, \ldots, b_m$ whose coefficients are a permissible set of values
$\chi_1', \chi_2', \ldots, \chi_m'$. The sets of values $\chi'$ thus obtained must be all
different, since if there were fewer than $m$ different permissible sets of
values $\chi'$ for the $\chi$'s, there would exist a linear function of the $\chi$'s
every one of whose eigenvalues vanishes, which would mean that the
linear function itself vanishes and the $\chi$'s are not linearly independent.
Thus the number of permissible sets of numerical values for the $\chi$'s is
just equal to $m$, which is the number of classes of permutations or the
number of partitions of $n$. This number is therefore the number of
exclusive sets of states.
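
That the number of classes equals the number of partitions of $n$ can be confirmed by brute force for small $n$. The sketch below counts the distinct cycle types among all $n!$ permutations and compares the result with a direct recursive count of partitions. Both function names are hypothetical, and the enumeration is only practical for small $n$.

```python
from itertools import permutations

def cycle_type(p):
    """Cycle lengths of a 0-based permutation tuple, in decreasing order."""
    seen, lengths = set(), []
    for start in range(len(p)):
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = p[k]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def count_classes(n):
    """Number of classes of similar permutations of n objects."""
    return len({cycle_type(p) for p in permutations(range(n))})

def count_partitions(n, largest=None):
    """Number of ways of writing n as a sum of positive integers,
    order ignored, with no part exceeding `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(count_partitions(n - k, k)
               for k in range(1, min(n, largest) + 1))

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, count_classes(n), count_partitions(n))
```

The two counts should agree for each $n$ tried.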

All dynamical variables of physical importance and all observable
quantities are symmetrical between the particles and thus commute
with all the $P$'s. Thus the only functions of the $P$'s of physical
importance are the $\chi$'s. The states corresponding to $|\chi'\rangle$ and to
$f(P)|\chi'\rangle$, where $|\chi'\rangle$ is any eigenket of the $\chi$'s belonging to the
eigenvalues $\chi'$ and $f(P)$ is any function of the $P$'s such that
$f(P)|\chi'\rangle \neq 0$, are observationally indistinguishable and are thus
physically equivalent. There is a definite number, $n(\chi')$ say, of
independent kets which can be formed by multiplying $|\chi'\rangle$ by functions
of the $P$'s, which number depends only on the $\chi'$'s. It is the number of
rows and columns in a matrix representation of the $P$'s in which each
$\chi$ is equal to $\chi'$. If $|\chi'\rangle$ corresponds to a stationary state, $n(\chi')$ will be
its degree of degeneracy (so far as concerns degeneracy caused by the
symmetry between the particles). This degeneracy cannot be removed
by any perturbation that is symmetrical between the particles.

57. Determination of the energy-levels 

Let us apply the perturbation method of § 43 and make a first-order
calculation of the energy-levels in the case when the Hamiltonian
does not involve the time explicitly. We suppose that for our
unperturbed stationary states of the assembly each of the similar particles
has its own individual state. With $n$ particles, we shall have $n$ of
these states, corresponding to kets $|\alpha^1\rangle, |\alpha^2\rangle, \ldots, |\alpha^n\rangle$ say, which we
assume for the present to be all orthogonal. The ket for the assembly
is then

$$ |X\rangle = |\alpha_1^1\rangle|\alpha_2^2\rangle\ldots|\alpha_n^n\rangle, \tag{16} $$

like (1) with $\alpha^1, \alpha^2, \ldots$ instead of $a, b, \ldots$. If we apply any permutation
$P$ to it we get another ket

$$ P|X\rangle = |\alpha_r^1\rangle|\alpha_s^2\rangle\ldots|\alpha_z^n\rangle \tag{17} $$

say, $r, s, \ldots, z$ being some permutation of the numbers $1, 2, \ldots, n$,
corresponding to another stationary state of the assembly with the
same energy. There are thus altogether $n!$ unperturbed states with
this energy, if we assume there are no other causes of degeneracy.
According to the method of § 43 when the unperturbed system is
degenerate, we must consider those elements of the matrix
representing the perturbing energy $V$ that refer to two states with the same
energy, i.e. those of the type $\langle X|P_aVP_b|X\rangle$. These will form a matrix
with $n!$ rows and columns, whose eigenvalues are the first-order
corrections in the energy-levels.

We must now introduce another kind of permutation operator
which can be applied to kets of the form (17), namely a permutation
which acts on the indices of the $\alpha$'s. We denote such a permutation
operator by $P^\alpha$. The essential difference between the $P$'s and the
$P^\alpha$'s may be seen in the following way. Let us consider a permutation
in the general sense, say that consisting of the interchange of 2 and 3.
This may be interpreted either as the interchange of the objects 2 and
3 or as the interchange of the objects in the places 2 and 3, these two
operations producing in general quite different results. The first of
these interpretations is the one that gives the operators $P$, the objects
concerned being the similar particles. A permutation $P$ can be
applied to an arbitrary ket for the assembly. A permutation with the
second interpretation has a meaning, however, only when applied
to a ket of the form (17), for which each of the particles is in a 'place'
specified by an $\alpha$, or to a sum of kets of the form (17). A permutation
$P$ may be considered as an ordinary dynamical variable. A
permutation $P^\alpha$ may be considered as a dynamical variable in a restricted
sense, valid when one is dealing only with states obtainable by
superposition of the various states (17). This is the case for our present
perturbation problem.

We can form algebraic functions of the $P^\alpha$ which will be other
operators applicable to kets of the form (17). In particular we can
form $\chi(P_c^\alpha)$, the average of all $P^\alpha$'s in a certain class $c$. This must
equal $\chi(P_c)$, the average of the permutation operators $P$ in the same
class, since the total set of all permutations in a given class must
evidently be the same whether the permutations are applied to the
particles or to the places the particles are in. Any $P$ commutes with
any $P^\alpha$, i.e.

$$ P_aP_b^\alpha = P_b^\alpha P_a. \tag{18} $$

By labelling the $\alpha$'s by the same numbers $1, 2, 3, \ldots, n$ which label
the particles, we set up a one-one correspondence between the $\alpha$'s and
the particles, so that given any permutation $P_a$ applying to the
particles, we can give a meaning to the same permutation $P_a^\alpha$ applying
to the $\alpha$'s. This meaning is such that, for the ket $|X\rangle$ given by (16),

$$ P_a^\alpha P_a|X\rangle = |X\rangle. \tag{19} $$
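
The relations (18) and (19) can be illustrated with a simple model of the kets (17). In the sketch below such a ket is a tuple whose entry at position $i$ names the state occupied by particle $i$. A permutation of the particles moves the entries to new positions, and a permutation of the $\alpha$'s changes the entries themselves. This tuple model, the direction in which each map is taken to act, and the function names are all assumptions of the illustration rather than definitions from the text.

```python
from itertools import permutations

n = 3
group = list(permutations(range(n)))   # permutations as 0-based tuples
X = tuple(range(n))                    # model of the ket (16): particle i in state i

def act_on_particles(P, ket):
    """Model of P: particle i is replaced by particle P[i],
    which takes over the state that particle i was in."""
    out = [None] * n
    for i in range(n):
        out[P[i]] = ket[i]
    return tuple(out)

def act_on_states(P, ket):
    """Model of P^alpha: the state label s carried by each particle
    is replaced by the label P[s]."""
    return tuple(P[s] for s in ket)

# Relation (18): every particle permutation commutes with every
# permutation of the state labels, on every ket of the form (17).
relation_18 = all(
    act_on_particles(Pa, act_on_states(Pb, K))
    == act_on_states(Pb, act_on_particles(Pa, K))
    for Pa in group for Pb in group for K in group
)

# Relation (19): the same permutation applied to the particles and then
# to the state labels restores the ket X.
relation_19 = all(act_on_states(P, act_on_particles(P, X)) == X for P in group)

if __name__ == "__main__":
    print(relation_18, relation_19)
```

In this model the two kinds of operator act on different aspects of the tuple, positions in one case and entries in the other, which is why they commute.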

Since the various kets $|\alpha^1\rangle, |\alpha^2\rangle, \ldots$ are orthogonal, $|X\rangle$ and $P|X\rangle$ are
orthogonal unless $P = 1$. It follows that, for any coefficients $c_P$,

$$ \sum_P c_P\langle X|P^\alpha P_a|X\rangle = c_{P_a}, \tag{20} $$

provided $|X\rangle$ is normalized, the summation being over all the $n!$
permutations $P$ or $P^\alpha$, with $P_a$ fixed. Now define $V_P$ by

$$ V_P = \langle X|VP|X\rangle. \tag{21} $$

We then have, for any two permutations $P_x$ and $P_y$,

$$ \langle X|P_xVP_y|X\rangle = \langle X|VP_xP_y|X\rangle = V_{P_xP_y}
   = \sum_P V_P\langle X|P^\alpha P_xP_y|X\rangle $$

with the help of (20). From (18) this gives

$$ \langle X|P_xVP_y|X\rangle = \sum_P V_P\langle X|P_xP^\alpha P_y|X\rangle. \tag{22} $$

We may write this result as

$$ V \approx \sum_P V_PP^\alpha, \tag{23} $$

where the sign $\approx$ means an equation in a restricted sense, the
operators on the two sides being equal so long as they are used only
with kets of the form $P|X\rangle$ and their conjugate imaginary bras.

The formula (23) shows that the perturbing energy $V$ is equal, in
the restricted sense, to a linear function of the permutation operators
$P^\alpha$ with coefficients $V_P$ given by (21). The restricted sense is adequate
for the calculation of the first-order correction in the energy-levels,
as this calculation involves only those matrix elements of $V$ given by
(22). The formula (23) is a very convenient one because the expression
on its right-hand side is easily handled.

As an example of an application of (23) we shall determine the
average energy of all those states, arising from the unperturbed state
(16), that belong to one exclusive set. This requires us to calculate
the average eigenvalue of $V$ for those states (17) for which the $\chi$'s
have specified numerical values $\chi'$. Now the average eigenvalue of
$P^\alpha_a$ for any of these states equals that of $P^\alpha_b P^\alpha_a (P^\alpha_b)^{-1}$ for arbitrary
$P^\alpha_b$ and thus equals that of $n!^{-1} \sum_{P^\alpha_b} P^\alpha_b P^\alpha_a (P^\alpha_b)^{-1}$, which is
$\chi'(P^\alpha_a)$ or $\chi'(P_a)$. Hence the average eigenvalue of $V$ is
$\sum_P V_P \chi'(P)$. A similar
method could be used for calculating the average eigenvalue of any
function of $V$, it being necessary only to replace each $P^\alpha$ by $\chi'(P)$ to
perform the averaging.

The number of energy-levels in an exclusive set $\chi = \chi'$ that arise
from a given state of the unperturbed system is equal to the number
of eigenvalues of the right-hand side of (23) that are consistent with
the equations $\chi = \chi'$. This number is the number $n(\chi')$ introduced
at the end of the preceding section, and is thus just the degree of
degeneracy of the states in this set.

We have assumed that the individual kets $|\alpha^1\rangle$, $|\alpha^2\rangle$,... which
determine the unperturbed state according to (16) are all orthogonal. The
theory can easily be extended to the case when some of these kets are
equal, any two that are not equal being still restricted to be orthogonal.
We now have some permutations $P^\alpha$ such that $P^\alpha|X\rangle = |X\rangle$,
namely those permutations which involve only interchanges of
equal $\alpha$'s. Equation (20) will now hold if the summation is extended
only over those $P$'s which make $P^\alpha|X\rangle$ different. With this change
in the meaning of $\sum_P$ all the previous equations still hold, including
the result (23). For the present $|X\rangle$ there will be restrictions on the
possible numerical values of the $\chi$'s, e.g. they cannot have those
values corresponding to $|X\rangle$ being antisymmetrical.

58. Application to electrons 

Let us consider the case when the similar particles are electrons. 
This requires, according to Pauli’s exclusion principle discussed in 
§ 54, that we take into account only the antisymmetrical states. It 
is now necessary to make explicit reference to the fact that electrons 
have spins, which show themselves through an angular momentum
and a magnetic moment. The effect of the spin on the motion of
an electron in an electromagnetic field is not very great. There
are additional forces on the electron due to its magnetic moment,
requiring additional terms in the Hamiltonian. The spin angular
momentum does not have any direct action on the motion, but it comes
into play when there are forces tending to rotate the magnetic moment,
since the magnetic moment and angular momentum are constrained
to be always in the same direction. In the absence of a strong
magnetic field these effects are all small, of the same order of
magnitude as the corrections required by relativistic mechanics, and there
would be no point in taking them into account in a non-relativistic
theory. The importance of the spin lies not in these small effects on the
motion of the electron, but in the fact that it gives two internal states
to the electron, corresponding to the two possible values of the spin
component in any assigned direction, which causes a doubling in the
number of independent states of an electron. This fact has far-reaching
consequences when combined with Pauli’s exclusion principle.

In dealing with an assembly of electrons we have two kinds of
dynamical variables. The first kind, which we may call the orbital
variables, consists of the coordinates $x$, $y$, $z$ of all the electrons and
their conjugate momenta $p_x$, $p_y$, $p_z$. The second kind consists of the
spin variables, the variables $\sigma_x$, $\sigma_y$, $\sigma_z$, as introduced in § 37, for all
the electrons. These two kinds of variables belong to different degrees
of freedom. According to §§ 20 and 21, a ket fixing the state of the
whole system may be of the form $|A\rangle|B\rangle$, where $|A\rangle$ is a ket referring
to the orbital variables alone and $|B\rangle$ is a ket referring to the spin
variables alone, and the general ket fixing a state of the whole system
is a sum or integral of kets of this form. This way of looking at things
enables us to introduce two kinds of permutation operators, the first
kind, $P^x$ say, applying to the orbital variables only and operating
only on the factor $|A\rangle$, and the second kind, $P^\sigma$ say, applying only
to the spin variables and operating only on the factor $|B\rangle$. The $P^x$'s
and $P^\sigma$'s can each be applied to any ket for the whole system, not
merely to certain special kets, like the $P^\alpha$'s of the preceding section.
The permutations $P$ that we have had up to the present apply to all
the dynamical variables of the particles concerned, so for electrons
they will apply to both the orbital and the spin variables. This means
that each $P_a$ equals the product

    $P_a = P^x_a P^\sigma_a$.    (24)

We can now see the need for taking the spin variables into account
when applying Pauli’s exclusion principle, even if we neglect the spin
forces in the Hamiltonian. For any state occurring in nature each
$P_a$ must have the value $\pm 1$, according to whether it is an even or
an odd permutation, so from (24)

    $P^x_a P^\sigma_a = \pm 1$.    (25)

The theory of the three preceding sections would become trivial if
applied directly to electrons, for which each $P_a = \pm 1$. We may,
however, apply it to the $P^x$ permutations of electrons. The $P^\sigma$'s are
constants of the motion if we neglect the terms in the Hamiltonian
that arise from the spin forces, since this neglect results in the
Hamiltonian not involving the spin dynamical variables $\sigma$ at all. The
$P^x$'s must then also be constants of the motion. We can now
introduce new $\chi$'s, equal to the average of all of the $P^x$'s in each class, and
assert that for any permissible set of numerical values $\chi'$ for these $\chi$'s
there will be one exclusive set of states. Thus there exist exclusive sets
of states for systems containing many electrons even when we restrict
ourselves to a consideration of only those states that satisfy Pauli’s
principle. The exclusiveness of the sets of states is now, of course,
only approximate, since the $\chi$'s are constants only so long as we
neglect the spin forces. There will actually be a small probability for
a transition from a state in one set to a state in another.

Equation (25) gives us a simple connexion between the $P^x$'s and
$P^\sigma$'s, which means that instead of studying the dynamical variables
$P^x$ we can get all the results we want, e.g. the characters $\chi'$, by
studying the dynamical variables $P^\sigma$. The $P^\sigma$'s are much easier to
study on account of there being only two independent states of spin
for each electron. This fact results in there being fewer characters $\chi'$
for the group of permutations of the $\sigma$-variables than for the group
of general permutations, since it prevents a ket in the spin variables
from being antisymmetrical in more than two of them.

The study of the $P^\sigma$'s is made specially easy by the fact that we
can express them as algebraic functions of the dynamical variables $\sigma$.
Consider the quantity

    $O_{12} = \tfrac12\{1 + \sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2}\} = \tfrac12\{1 + (\sigma_1, \sigma_2)\}$.

With the help of equations (50) and (51) of § 37 we find readily that

    $(\sigma_1, \sigma_2)^2 = (\sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2})^2 = 3 - 2(\sigma_1, \sigma_2)$,    (26)

and hence that

    $O_{12}^2 = 1$.    (27)

Again, we find

    $O_{12}\sigma_{x1} = \tfrac12\{\sigma_{x1} + \sigma_{x2} - i\sigma_{z1}\sigma_{y2} + i\sigma_{y1}\sigma_{z2}\}$,
    $\sigma_{x2}O_{12} = \tfrac12\{\sigma_{x2} + \sigma_{x1} + i\sigma_{y1}\sigma_{z2} - i\sigma_{z1}\sigma_{y2}\}$,

and hence $O_{12}\sigma_{x1} = \sigma_{x2}O_{12}$.

Similar relations hold for $\sigma_{y1}$ and $\sigma_{z1}$, so that we have

    $O_{12}\sigma_1 = \sigma_2 O_{12}$

or

    $O_{12}\sigma_1 O_{12}^{-1} = \sigma_2$.

From this we can obtain with the help of (27)

    $O_{12}\sigma_2 O_{12}^{-1} = \sigma_1$.

These commutation relations for $O_{12}$ with $\sigma_1$ and $\sigma_2$ are precisely the
same as those for $P^\sigma_{12}$, the permutation consisting of the interchange
of the spin variables of electrons 1 and 2. Thus we can put

    $O_{12} = cP^\sigma_{12}$,

where $c$ is a number. Equation (27) shows that $c = \pm 1$. To
determine which of these values for $c$ is the correct one, we observe that
the eigenvalues of $P^\sigma_{12}$ are 1, 1, 1, $-1$, corresponding to the fact that
there exist three independent symmetrical and one antisymmetrical
state in the spin variables of two electrons, namely, with the notation
of § 37, the states represented by the three symmetrical functions
$f_\alpha(\sigma'_{z1})f_\alpha(\sigma'_{z2})$, $f_\beta(\sigma'_{z1})f_\beta(\sigma'_{z2})$,
$f_\alpha(\sigma'_{z1})f_\beta(\sigma'_{z2}) + f_\beta(\sigma'_{z1})f_\alpha(\sigma'_{z2})$, and the one
antisymmetrical function $f_\alpha(\sigma'_{z1})f_\beta(\sigma'_{z2}) - f_\beta(\sigma'_{z1})f_\alpha(\sigma'_{z2})$. Thus the mean
of the eigenvalues of $P^\sigma_{12}$ is $\tfrac12$. Now the mean of the eigenvalues of
$(\sigma_1, \sigma_2)$ is evidently zero and hence the mean of the eigenvalues of $O_{12}$
is $\tfrac12$. Thus we must have $c = +1$, and so we can put

    $P^\sigma_{12} = \tfrac12\{1 + (\sigma_1, \sigma_2)\}$.    (28)
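
The properties of $O_{12}$ used above can be confirmed numerically. The sketch below is an illustration only: it assumes the usual $2 \times 2$ Pauli matrices as a representation of the $\sigma$'s and the product basis ordered as $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, and the helper names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
PAULI = (
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
)
sigma1 = [kron(s, I2) for s in PAULI]   # spin variables of electron 1
sigma2 = [kron(I2, s) for s in PAULI]   # spin variables of electron 2
I4 = kron(I2, I2)

# (sigma_1, sigma_2) and O_12 = (1/2){1 + (sigma_1, sigma_2)}
dot = [[0] * 4 for _ in range(4)]
for s1, s2 in zip(sigma1, sigma2):
    dot = add(dot, matmul(s1, s2))
O12 = scale(0.5, add(I4, dot))

# Interchange of the two spins in the assumed basis ordering.
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

square_is_one = close(matmul(O12, O12), I4)                  # equation (27)
intertwines = all(close(matmul(O12, s1), matmul(s2, O12))    # O12 sigma_1 = sigma_2 O12
                  for s1, s2 in zip(sigma1, sigma2))
equals_swap = close(O12, SWAP)                               # equation (28), c = +1
mean_eigenvalue = sum(O12[i][i] for i in range(4)) / 4       # trace / dimension
print(square_is_one, intertwines, equals_swap, mean_eigenvalue)
```

The mean eigenvalue is obtained from the trace, which is the shortcut behind the remark that the mean of the eigenvalues of $(\sigma_1, \sigma_2)$ is evidently zero.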

In this way any permutation $P^\sigma$ consisting simply of an interchange
can be expressed as an algebraic function of the $\sigma$'s. Any other
permutation $P^\sigma$ can be expressed as a product of interchanges and can
therefore also be expressed as a function of the $\sigma$'s. With the help of
(25) we can now express the $P^x$'s as algebraic functions of the $\sigma$'s and
eliminate the $P^\sigma$'s from the discussion. We have, since the $-$ sign
must be taken in (25) when the permutations are interchanges and
since the square of an interchange is unity,

    $P^x_{12} = -\tfrac12\{1 + (\sigma_1, \sigma_2)\}$.    (29)

The formula (29) may conveniently be used for the evaluation of
the characters $\chi'$ which define the exclusive sets of states. We have,
for example, for the permutations consisting of interchanges,

    $\chi_{12} = \chi(P^x_{12}) = -\tfrac12\Bigl\{1 + \frac{2}{n(n-1)} \sum_{r<t} (\sigma_r, \sigma_t)\Bigr\}$.

If we introduce the dynamical variable $s$ to describe the magnitude of
the total spin angular momentum, $\tfrac12 \sum_r \sigma_r$ in units of $\hbar$, through the
formula

    $s(s+1) = (\tfrac12 \sum_r \sigma_r, \tfrac12 \sum_t \sigma_t)$,

in agreement with (39) of § 36, we have

    $2 \sum_{r<t} (\sigma_r, \sigma_t) = (\sum_r \sigma_r, \sum_t \sigma_t) - \sum_r (\sigma_r, \sigma_r) = 4s(s+1) - 3n$.

Hence

    $\chi_{12} = -\tfrac12\Bigl\{1 + \frac{4s(s+1) - 3n}{n(n-1)}\Bigr\} = -\frac{n(n-4) + 4s(s+1)}{2n(n-1)}$.    (30)

Thus $\chi_{12}$ is expressible as a function of the dynamical variable $s$ and
of $n$, the number of electrons. Any of the other $\chi$'s could be evaluated
on similar lines and would have to be a function of $s$ and $n$ only, since
there are no other symmetrical functions of all the $\sigma$ dynamical
variables which could be involved. There is therefore one set of
numerical values $\chi'$ for the $\chi$'s, and thus one exclusive set of states,
for each eigenvalue $s'$ of $s$. The eigenvalues of $s$ are

    $\tfrac12 n$, $\tfrac12 n - 1$, $\tfrac12 n - 2$, ...,

the series terminating with 0 or $\tfrac12$.
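
As an illustration of formula (30), the allowed eigenvalues of $s$ and the corresponding values of $\chi_{12}$ can be tabulated for a small number of electrons. This is a minimal sketch using exact fractions; the function names are hypothetical and $n \ge 2$ is assumed.

```python
from fractions import Fraction

def spin_values(n):
    """Eigenvalues of s for n electrons: n/2, n/2 - 1, ..., ending at 0 or 1/2."""
    s = Fraction(n, 2)
    values = []
    while s >= 0:
        values.append(s)
        s -= 1
    return values

def chi12(n, s):
    """Character of the interchanges according to equation (30); needs n >= 2."""
    s = Fraction(s)
    return -(n * (n - 4) + 4 * s * (s + 1)) / (2 * n * (n - 1))

for n in (2, 3, 4):
    for s in spin_values(n):
        print(n, s, chi12(n, s))
```

For two electrons the sketch gives $\chi_{12} = -1$ for $s = 1$ and $\chi_{12} = +1$ for $s = 0$, in line with (29): a symmetrical spin state goes with an antisymmetrical orbital state and conversely.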

We see in this way that each of the stationary states of a system
with several electrons is an eigenstate of $s$, the magnitude in units of
$\hbar$ of the total spin angular momentum $\tfrac12 \sum_r \sigma_r$, belonging to a definite
eigenvalue $s'$. For any given $s'$ there will be $2s'+1$ possible values
for a component of the total spin vector in any direction and these
will correspond to $2s'+1$ independent stationary states with the same
energy. When we do not neglect the forces due to the spin magnetic
moments these $2s'+1$ states will in general be split up into $2s'+1$
states with slightly different energies, and will thus form a multiplet
of multiplicity $2s'+1$. Transitions in which $s'$ changes, i.e. transitions
from one multiplicity to another, cannot occur when the spin forces
are neglected and will have only a small probability of occurrence
when the spin forces are not neglected.

We can determine the energy-levels of a system with several
electrons to the first approximation by applying the theory of the
preceding section with the kets $|\alpha^r\rangle$ referring only to the orbital
variables and using formula (23). If we consider only the Coulomb
forces between the electrons, then the interaction energy $V$ will
consist of a sum of parts each referring to only two electrons, which
will result in all the matrix elements $V_P$ vanishing except those for
which $P^x$ is the identical permutation or is simply an interchange of
two electrons. Thus (23) will reduce to

    $V \approx V_1 + \sum_{r<s} V_{rs} P^\alpha_{rs}$,    (31)

$V_{rs}$ being the matrix element referring to the interchange of electrons
$r$ and $s$. Since the $P^\alpha$'s have the same properties as the $P^x$'s, any
function of the $P^\alpha$'s will have the same eigenvalues as the
corresponding function of the $P^x$'s, so that the right-hand side of (31)
will have the same eigenvalues as

    $V_1 + \sum_{r<s} V_{rs} P^x_{rs}$

or

    $V_1 - \tfrac12 \sum_{r<s} V_{rs}\{1 + (\sigma_r, \sigma_s)\}$    (32)

from (29). The eigenvalues of (32) will give the first-order corrections
in the energy-levels. The form of (32) shows that a model which
assumes a coupling energy between the spins of the various electrons,
of magnitude $-\tfrac12 V_{rs}(\sigma_r, \sigma_s)$ for the electrons in the $r$ and $s$ orbital
states, would meet with a fair amount of success. This coupling
energy is much greater than that of the spin magnetic moments. Such
models of the atom were in use before the justification by quantum
mechanics was obtained.

We may have two of the orbital states of the unperturbed system
the same, i.e. the kets $|\alpha^r\rangle$ in the orbital variables for two electrons
may be the same. Suppose $|\alpha^1\rangle$ and $|\alpha^2\rangle$ are the same. Then we must
take only those eigenvalues of (31) that are consistent with $P^\alpha_{12} = 1$,
or those eigenvalues of (32) that are consistent with $P^x_{12} = 1$ or
$P^\sigma_{12} = -1$. From (28) this condition gives $(\sigma_1, \sigma_2) = -3$, so that
$(\sigma_1 + \sigma_2)^2 = 0$. Thus the resultant of the two spins $\sigma_1$ and $\sigma_2$ is zero,
which may be interpreted as the spins $\sigma_1$ and $\sigma_2$ being antiparallel.
Thus we may say that two electrons in the same orbital state have
their spins antiparallel. More than two electrons cannot be in the
same orbital state.
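
The step from $(\sigma_1, \sigma_2) = -3$ to a vanishing resultant spin may be written out as follows, using only the fact that the squares of the three components of each $\sigma$ are unity and that $\sigma_1$ commutes with $\sigma_2$:

```latex
\begin{align*}
(\sigma_1 + \sigma_2)^2
  &= (\sigma_1, \sigma_1) + (\sigma_2, \sigma_2) + 2(\sigma_1, \sigma_2) \\
  &= 3 + 3 + 2(-3) = 0 .
\end{align*}
```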

THEORY OF RADIATION

59. An assembly of bosons 

We consider a dynamical system composed of $u'$ similar particles.
We set up a representation for one of the particles with discrete basic
kets $|\alpha^{(1)}\rangle$, $|\alpha^{(2)}\rangle$, $|\alpha^{(3)}\rangle$,.... Then, as explained in § 54, we get a
symmetrical representation of the assembly of $u'$ particles by taking as
basic kets the products

    $|\alpha^a_1\rangle|\alpha^b_2\rangle|\alpha^c_3\rangle \ldots |\alpha^g_{u'}\rangle = |\alpha^a_1 \alpha^b_2 \alpha^c_3 \ldots \alpha^g_{u'}\rangle$    (1)

in which there is one factor for each particle, the suffixes 1, 2, 3,..., $u'$
of the $\alpha$'s being the labels of the particles and the indices $a$, $b$, $c$,..., $g$
denoting indices (1), (2), (3),... in the basic kets for one particle. If the
particles are bosons, so that only symmetrical states occur in nature,
then we need to work with only the symmetrical kets that can be
constructed from the kets (1). The states corresponding to these
symmetrical kets will form a complete set of states for the assembly
of bosons. We can build up a theory of them as follows.

We introduce the linear operator $S$ defined by

    $S = u'!^{-1/2} \sum P$,    (2)

the sum being taken over all the $u'!$ permutations of the $u'$ particles.
Then $S$ applied to any ket for the assembly gives a symmetrical ket.
We may therefore call $S$ the symmetrizing operator. From (8) of § 55
it is real. Applied to the ket (1) it gives

    $u'!^{-1/2} \sum P|\alpha^a_1 \alpha^b_2 \alpha^c_3 \ldots \alpha^g_{u'}\rangle = S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle$,    (3)

the labels of the particles being omitted on the right-hand side as
they are no longer relevant. The ket (3) corresponds to a state for
the assembly of $u'$ bosons with a definite distribution of the bosons
among the various boson states, without any particular boson being
assigned to any particular state. The distribution of bosons is
specified if we specify how many bosons are in each boson state. Let
$n'_1$, $n'_2$, $n'_3$,... be the numbers of bosons in the states $\alpha^{(1)}$, $\alpha^{(2)}$, $\alpha^{(3)}$,...
respectively with this distribution. The $n'$'s are defined algebraically
by the equation

    $\alpha^a + \alpha^b + \alpha^c + \ldots + \alpha^g = n'_1 \alpha^{(1)} + n'_2 \alpha^{(2)} + n'_3 \alpha^{(3)} + \ldots$.    (4)

The sum of the $n'$'s is of course $u'$. The number of $n'$'s is equal to
the number of basic kets $|\alpha^{(r)}\rangle$, which in most applications of the
theory is very much greater than $u'$, so most of the $n'$'s will be zero.
If $\alpha^a$, $\alpha^b$, $\alpha^c$,..., $\alpha^g$ are all different, i.e. if the $n'$'s are all 0 or 1, the
ket (3) is normalized, since in this case the terms on the left-hand
side of (3) are all orthogonal to one another and each contributes
$u'!^{-1}$ to the squared length of the ket. However, if $\alpha^a$, $\alpha^b$, $\alpha^c$,..., $\alpha^g$
are not all different, those terms on the left-hand side of (3) will
be equal which arise from permutations $P$ which merely interchange
bosons in the same state. The number of equal terms will be
$n'_1!\, n'_2!\, n'_3! \ldots$, so the squared length of the ket (3) will be

    $\langle\alpha^a \alpha^b \alpha^c \ldots \alpha^g|S^2|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = n'_1!\, n'_2!\, n'_3! \ldots$.    (5)
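
The counting that leads to (5) can be checked by brute force for a few bosons. The sketch below is only an illustration under stated assumptions: a product ket is represented by a tuple of state labels, orthonormal one-boson states are assumed, and the function names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod

def squared_length(states):
    """Squared length of S|states>, with S = u'!^(-1/2) times the sum of all P."""
    u = len(states)
    # Each permutation yields one product ket; identical kets add up, so a ket
    # occurring c times contributes c*c to the squared length before the
    # overall factor 1/u'! is applied.
    terms = Counter(tuple(states[i] for i in p) for p in permutations(range(u)))
    return sum(c * c for c in terms.values()) // factorial(u)

def occupation_product(states):
    """n'_1! n'_2! n'_3! ... for the given distribution of bosons."""
    return prod(factorial(n) for n in Counter(states).values())

for states in [("a", "b", "c"), ("a", "a", "b"), ("a", "a", "a"), ("a", "a", "b", "b")]:
    print(states, squared_length(states), occupation_product(states))
```

When all the states are different, every permuted ket occurs once and the squared length is unity, which is the normalized case mentioned in the text.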

For dealing with a general state of the assembly we can introduce
the numbers $n_1$, $n_2$, $n_3$,... of bosons in the states $\alpha^{(1)}$, $\alpha^{(2)}$, $\alpha^{(3)}$,...
respectively and treat the $n$'s as dynamical variables or as
observables. They have the eigenvalues 0, 1, 2,..., $u'$. The ket (3) is a
simultaneous eigenket of all the $n$'s, belonging to the eigenvalues
$n'_1$, $n'_2$, $n'_3$,.... The various kets (3) form a complete set for the
dynamical system consisting of $u'$ bosons, so the $n$'s all commute
(see the converse to the theorem of § 13). Further, there is only one
independent ket (3) belonging to any set of eigenvalues $n'_1$, $n'_2$, $n'_3$,....
Hence the $n$'s form a complete set of commuting observables. If we
normalize the kets (3) and then label the resulting kets by the
eigenvalues of the $n$'s to which they belong, i.e. if we put

    $(n'_1!\, n'_2!\, n'_3! \ldots)^{-1/2} S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = |n'_1 n'_2 n'_3 \ldots\rangle$,    (6)

we get a set of kets $|n'_1 n'_2 n'_3 \ldots\rangle$, with the $n'$'s taking on all non-negative
integral values adding up to $u'$, which kets will form the basic kets
of a representation with the $n$'s diagonal.

The $n$'s can be expressed as functions of the observables $\alpha_1$, $\alpha_2$,
$\alpha_3$,..., $\alpha_{u'}$ which define the basic kets of the individual bosons by
means of the equations

    $n_a = \sum_r \delta_{\alpha_r \alpha^a}$,    (7)

or the equations

    $\sum_a n_a f(\alpha^a) = \sum_r f(\alpha_r)$    (8)

holding for any function $f$.
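
Equations (7) and (8) amount to counting how many bosons sit in each state. A minimal sketch, with an arbitrary list of eigenvalues and an arbitrary test function chosen purely for illustration:

```python
from collections import Counter

def occupation_numbers(alphas):
    """n_a of equation (7): the number of bosons r whose alpha_r equals alpha^a."""
    return Counter(alphas)

def f(x):
    """An arbitrary function of the one-boson eigenvalue, used only as a test case."""
    return x * x + 1

alphas = [2, 5, 2, 7, 2]     # hypothetical eigenvalues alpha_r for five bosons
n = occupation_numbers(alphas)

lhs = sum(n_a * f(alpha_a) for alpha_a, n_a in n.items())   # sum over states a
rhs = sum(f(alpha_r) for alpha_r in alphas)                 # sum over bosons r
print(dict(n), lhs, rhs)
```

The two sums agree because each boson contributes $f$ of its own eigenvalue exactly once, whichever way the terms are grouped.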

Let us now suppose that the number of bosons in the assembly is
not given, but is variable. This number is then a dynamical variable
or observable $u$, with eigenvalues 0, 1, 2,..., and the ket (3) is an
eigenket of $u$ belonging to the eigenvalue $u'$. To get a complete
set of kets for our dynamical system we must now take all the
symmetrical kets (3) for all values of $u'$. We may arrange them in
order thus

    $|\rangle$, $|\alpha^a\rangle$, $S|\alpha^a \alpha^b\rangle$, $S|\alpha^a \alpha^b \alpha^c\rangle$, ...,    (9)

where first is written the ket, with no label, corresponding to the
state with no bosons present, then come the kets corresponding to
states with one boson present, then those corresponding to states
with two bosons, and so on. A general state corresponds to a ket
which is a sum of the various kets (9). The kets (9) are all orthogonal
to one another, two kets referring to the same number of bosons being
orthogonal as before, and two referring to different numbers of bosons
being orthogonal since they are eigenkets of $u$ belonging to different
eigenvalues. By normalizing all the kets (9), we get a set of kets like
(6) with no restriction on the $n'$'s (i.e. each $n'$ taking on all
non-negative integral values) and these kets form the basic kets of a
representation with the $n$'s diagonal for the dynamical system
consisting of a variable number of bosons.

If there is no interaction between the bosons and if the basic kets
$|\alpha^{(1)}\rangle$, $|\alpha^{(2)}\rangle$,... correspond to stationary states of a boson, the kets (9)
will correspond to stationary states for the assembly of bosons. The
number $u$ of bosons is now constant in time, but it need not be a
specified number, i.e. the general state is a superposition of states
with various values for $u$. If the energy of one boson is $H(\alpha)$, the
energy of the assembly will be

    $\sum_r H(\alpha_r) = \sum_a n_a H^a$    (10)

from (8), $H^a$ being short for the number $H(\alpha^a)$. This gives the
Hamiltonian for the assembly as a function of the dynamical
variables $n$.


60. The connexion between bosons and oscillators 

In § 34 we studied the harmonic oscillator, a dynamical system of
one degree of freedom describable in terms of a canonical $q$ and $p$,
such that the Hamiltonian is a sum of squares of $q$ and $p$, with
numerical coefficients. We define a general oscillator mathematically
as a system of one degree of freedom describable in terms of a
canonical $q$ and $p$, such that the Hamiltonian is a power series in $q$
and $p$, and remains so if the system is perturbed in any way. We
shall now study a dynamical system composed of several of these
oscillators. We can describe each oscillator in terms of, instead of
$q$ and $p$, a complex dynamical variable $\eta$, like the $\eta$ of § 34, and its
conjugate complex $\bar\eta$, satisfying the commutation relation (7) of
§ 34. We attach labels 1, 2, 3,... to the different oscillators, so that
the whole set of oscillators is describable in terms of the dynamical
variables $\eta_1$, $\eta_2$, $\eta_3$,..., $\bar\eta_1$, $\bar\eta_2$, $\bar\eta_3$,... satisfying the commutation
relations

    $\eta_a \eta_b - \eta_b \eta_a = 0$,
    $\bar\eta_a \bar\eta_b - \bar\eta_b \bar\eta_a = 0$,    (11)
    $\bar\eta_a \eta_b - \eta_b \bar\eta_a = \delta_{ab}$.

Put

    $\eta_a \bar\eta_a = n_a$,    (12)

so that

    $\bar\eta_a \eta_a = n_a + 1$.    (13)
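
For a single oscillator, (12) and (13) can be illustrated with finite matrices. The sketch below rests on an assumption not made in the text: the representation with $n$ diagonal is truncated to the first $N$ states, so that the relations can hold only away from the cut-off. The variable names are illustrative.

```python
from math import isclose, sqrt

N = 6  # keep only the states n = 0, 1, ..., N-1 (the truncation is an assumption)

# Matrices in the normalized basis of n: eta raises n by one, eta_bar lowers it.
eta = [[sqrt(col + 1) if row == col + 1 else 0.0 for col in range(N)]
       for row in range(N)]
eta_bar = [[eta[col][row] for col in range(N)] for row in range(N)]  # transpose

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

n_matrix = matmul(eta, eta_bar)      # (12): eta eta_bar = n
n_plus_one = matmul(eta_bar, eta)    # (13): eta_bar eta = n + 1

eq12_holds = all(
    isclose(n_matrix[i][j], float(i) if i == j else 0.0, abs_tol=1e-12)
    for i in range(N) for j in range(N)
)
# The truncation spoils (13) only in the last row and column, so those are skipped.
eq13_holds = all(
    isclose(n_plus_one[i][j], float(i + 1) if i == j else 0.0, abs_tol=1e-12)
    for i in range(N - 1) for j in range(N - 1)
)
print(eq12_holds, eq13_holds)
```

The failure at the cut-off is a reminder that the commutation relations (11) cannot be satisfied exactly by finite matrices; the full set of non-negative integer eigenvalues of $n$ is needed.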


The $n$'s are observables which commute with one another and the
work of § 34 shows that each of them has as eigenvalues all
non-negative integers. For the $a$th oscillator there is a standard ket, $|0_a\rangle$
say, which is a normalized eigenket of $n_a$ belonging to the eigenvalue
zero. By multiplying all these standard kets together we get a
standard ket for the set of oscillators,

    $|0_1\rangle|0_2\rangle|0_3\rangle \ldots = |0_1 0_2 0_3 \ldots\rangle$,    (14)

which is a simultaneous eigenket of all the $n$'s belonging to the
eigenvalues zero. The standard ket (14) will be much used in the
future and will be denoted simply by $\rangle_S$. From (13) of § 34

    $\bar\eta_a \rangle_S = 0$    (15)

for any $a$. The work of § 34 also shows that, if $n'_1$, $n'_2$, $n'_3$,... are any
non-negative integers,

    $\eta_1^{n'_1} \eta_2^{n'_2} \eta_3^{n'_3} \ldots \rangle_S$    (16)

is a simultaneous eigenket of all the $n$'s belonging to the eigenvalues
$n'_1$, $n'_2$, $n'_3$,... respectively. The various kets (16) obtained by taking
different $n'$'s form a complete set of kets all orthogonal to one another
and the square of the length of one of them is, from (16) of § 34,
$n'_1!\, n'_2!\, n'_3! \ldots$. From this we see, bearing in mind the result (5), that
the kets (16) have just the same properties as the kets (9), so that
we can equate each ket (16) to the ket (9) referring to the same $n'$
values without getting any inconsistency. This involves putting

    $S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = \eta_a \eta_b \eta_c \ldots \eta_g \rangle_S$.    (17)

The standard ket $\rangle_S$ becomes equal to the first of the kets (9),
corresponding to no bosons present.

The effect of equation (17) is to identify the states of an assembly
of bosons with the states of a set of oscillators. This means that the
dynamical system consisting of an assembly of similar bosons is
equivalent to the dynamical system consisting of a set of oscillators; the two
systems are just the same system looked at from two different points of
view. There is one oscillator associated with each independent boson
state. We have here one of the most fundamental results of quantum
mechanics, which enables a unification of the wave and corpuscular
theories of light to be effected.

Our work in the preceding section was built up on a discrete set
of basic kets $|\alpha^a\rangle$ for a boson. We could pass to a different discrete
set of basic kets, $|\beta^A\rangle$ say, and build up a similar theory on them.
The basic kets for the assembly would then be, instead of (9),

    $|\rangle$, $|\beta^A\rangle$, $S|\beta^A \beta^B\rangle$, $S|\beta^A \beta^B \beta^C\rangle$, ....    (18)

The first of the kets (18), referring to no bosons present, is the same
as the first of the kets (9). Those kets (18) referring to one boson
present are linear functions of those kets (9) referring to one boson
present, namely

    $|\beta^A\rangle = \sum_a |\alpha^a\rangle\langle\alpha^a|\beta^A\rangle$,    (19)

and generally those kets (18) referring to $u'$ bosons present are linear
functions of those kets (9) referring to $u'$ bosons present. Associated
with the new basic states $|\beta^A\rangle$ for a boson there will be a new set
of oscillator variables $\eta_A$, and corresponding to (17) we shall have

    $S|\beta^A \beta^B \beta^C \ldots\rangle = \eta_A \eta_B \eta_C \ldots \rangle_S$.    (20)

Thus a ket $\eta_A \eta_B \ldots \rangle_S$ with $u'$ factors $\eta_A$, $\eta_B$,... must be a linear
function of kets $\eta_a \eta_b \ldots \rangle_S$ with $u'$ factors $\eta_a$, $\eta_b$,.... It follows that each
linear operator $\eta_A$ must be a linear function of the $\eta_a$'s. Equation
(19) gives

    $\eta_A \rangle_S = \sum_a \eta_a \rangle_S \langle\alpha^a|\beta^A\rangle$

and hence

    $\eta_A = \sum_a \eta_a \langle\alpha^a|\beta^A\rangle$.    (21)

Thus the $\eta$'s transform according to the same law as the basic kets for
a boson. The transformed $\eta$'s satisfy, with their conjugate complexes,
the same commutation relations (11) as the original ones. The
transformed $\eta$'s are on just the same footing as the original ones and hence,
when we look upon our dynamical system as a set of oscillators, the
different degrees of freedom have no invariant significance.
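
The statement that the transformed $\eta$'s satisfy the same commutation relations can be verified in one line. The sketch assumes, as the text does, that both sets of basic kets are orthonormal and complete, and it uses the conjugate complex of (21), $\bar\eta_A = \sum_a \langle\beta^A|\alpha^a\rangle \bar\eta_a$:

```latex
\begin{align*}
\bar\eta_A \eta_B - \eta_B \bar\eta_A
  &= \sum_{ab} \langle\beta^A|\alpha^a\rangle
     \,(\bar\eta_a \eta_b - \eta_b \bar\eta_a)\,
     \langle\alpha^b|\beta^B\rangle \\
  &= \sum_a \langle\beta^A|\alpha^a\rangle \langle\alpha^a|\beta^B\rangle
   = \langle\beta^A|\beta^B\rangle = \delta_{AB} .
\end{align*}
```

The first two relations of (11) carry over trivially, since the $\eta_A$ are linear in the commuting $\eta_a$ alone and the $\bar\eta_A$ are linear in the commuting $\bar\eta_a$ alone.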

The $\bar\eta$'s transform according to the same law as the basic bras for
a boson, and thus the same law as the numbers $\langle\alpha^a|x\rangle$ forming the
representative of a state $x$. This similarity people often describe by
saying that the $\bar\eta_a$'s are given by a process of second quantization
applied to $\langle\alpha^a|x\rangle$, meaning thereby that, after one has set up a
quantum theory for a single particle and so introduced the numbers
$\langle\alpha^a|x\rangle$ representing a state of the particle, one can make these
numbers into linear operators satisfying with their conjugate complexes
the correct commutation relations, like (11), and one then has the
appropriate mathematical basis for dealing with an assembly of the
particles, provided they are bosons. There is a corresponding
procedure for fermions, which will be given in § 65.

Since an assembly of bosons is the same as a set of oscillators, it
must be possible to express any symmetrical function of the boson
variables in terms of the oscillator variables $\eta$ and $\bar\eta$. An example
of this is provided by equation (10) with $\eta_a \bar\eta_a$ substituted for $n_a$.
Let us see how it goes in general. Take first the case of a function
of the boson variables of the form

    $U_T = \sum_r U_r$,    (22)

where each $U_r$ is a function only of the dynamical variables of the
$r$th boson, so that it has a representative $\langle\alpha^a_r|U_r|\alpha^b_r\rangle$ referring to the
basic kets $|\alpha^a_r\rangle$ of the $r$th boson. In order that $U_T$ may be symmetrical,
this representative must be the same for all $r$, so that it can depend
only on the two eigenvalues labelled by $a$ and $b$. We may therefore
write it

    $\langle\alpha^a_r|U_r|\alpha^b_r\rangle = \langle\alpha^a|U|\alpha^b\rangle = \langle a|U|b\rangle$    (23)

for brevity. We have

    $U_r|\alpha^{x_1}_1 \alpha^{x_2}_2 \ldots\rangle = \sum_a |\alpha^{x_1}_1 \alpha^{x_2}_2 \ldots \alpha^a_r \ldots\rangle \langle a|U|x_r\rangle$.    (24)

Summing this equation for all values of $r$ and applying the
symmetrizing operator $S$ to both sides, we get

    $S U_T|\alpha^{x_1}_1 \alpha^{x_2}_2 \ldots\rangle = \sum_r \sum_a S|\alpha^{x_1} \alpha^{x_2} \ldots \alpha^a \ldots\rangle \langle a|U|x_r\rangle$.    (25)

Since $U_T$ is symmetrical we can replace $SU_T$ by $U_T S$ and can then
substitute for the symmetrical kets in (25) their values given by (17).
We get in this way

    $U_T \eta_{x_1} \eta_{x_2} \ldots \rangle_S = \sum_a \sum_r \eta_a \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots \rangle_S \langle a|U|x_r\rangle$
        $= \sum_{ab} \eta_a \sum_r \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots \rangle_S\, \delta_{b x_r} \langle a|U|b\rangle$,    (26)

$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be cancelled out. Now from
(15) and the commutation relations (11)

    $\bar\eta_b \eta_{x_1} \eta_{x_2} \ldots \rangle_S = \sum_r \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots \rangle_S\, \delta_{b x_r}$    (27)

(note that $\bar\eta_b$ is like the operator of partial differentiation $\partial/\partial\eta_b$), so
(26) becomes

    $U_T \eta_{x_1} \eta_{x_2} \ldots \rangle_S = \sum_{ab} \eta_a \bar\eta_b \eta_{x_1} \eta_{x_2} \ldots \rangle_S \langle a|U|b\rangle$.    (28)

The kets $\eta_{x_1} \eta_{x_2} \ldots \rangle_S$ form a complete set, and hence we can infer from
(28) the operator equation

    $U_T = \sum_{ab} \eta_a \langle a|U|b\rangle \bar\eta_b$.    (29)

This gives us $U_T$ in terms of the $\eta$ and $\bar\eta$ variables and the matrix
elements $\langle a|U|b\rangle$.
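
As a consistency check on (29), consider the special case in which the one-boson function is diagonal in the $\alpha$-representation, $\langle a|U|b\rangle = H^a \delta_{ab}$, with $H^a$ the one-boson energy of (10). Then, using (12),

```latex
\begin{align*}
U_T = \sum_{ab} \eta_a \, H^a \delta_{ab} \, \bar\eta_b
    = \sum_a H^a \, \eta_a \bar\eta_a
    = \sum_a n_a H^a ,
\end{align*}
```

which is the Hamiltonian (10) of the assembly, as it should be.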

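Now let us take a symmetrical function of the boson variables
consisting of a sum of terms each referring to two bosons,

    $V_T = \sum_{r,s \neq r} V_{rs}$.    (30)

We do not need to assume $V_{rs} = V_{sr}$. Corresponding to (23), $V_{rs}$ has
matrix elements

    $\langle\alpha^a_r \alpha^b_s|V_{rs}|\alpha^c_r \alpha^d_s\rangle = \langle ab|V|cd\rangle$    (31)

for brevity. Proceeding as before we get, corresponding to (25),

    $S V_T|\alpha^{x_1}_1 \alpha^{x_2}_2 \ldots\rangle = \sum_{r,s \neq r} \sum_{ab} S|\alpha^{x_1} \alpha^{x_2} \ldots \alpha^a \ldots \alpha^b \ldots\rangle \langle ab|V|x_r x_s\rangle$    (32)

and corresponding to (26)

    $V_T \eta_{x_1} \eta_{x_2} \ldots \rangle_S = \sum_{abcd} \eta_a \eta_b \sum_{r,s \neq r} \eta_{x_r}^{-1} \eta_{x_s}^{-1} \eta_{x_1} \eta_{x_2} \ldots \rangle_S\, \delta_{c x_r} \delta_{d x_s} \langle ab|V|cd\rangle$.    (33)

We can deduce as an extension of (27)

    $\bar\eta_c \bar\eta_d \eta_{x_1} \eta_{x_2} \ldots \rangle_S = \sum_{r,s \neq r} \eta_{x_r}^{-1} \eta_{x_s}^{-1} \eta_{x_1} \eta_{x_2} \ldots \rangle_S\, \delta_{c x_r} \delta_{d x_s}$,    (34)

so that (33) becomes

    $V_T \eta_{x_1} \eta_{x_2} \ldots \rangle_S = \sum_{abcd} \eta_a \eta_b \bar\eta_c \bar\eta_d \eta_{x_1} \eta_{x_2} \ldots \rangle_S \langle ab|V|cd\rangle$,

giving us the operator equation

    $V_T = \sum_{abcd} \eta_a \eta_b \langle ab|V|cd\rangle \bar\eta_c \bar\eta_d$.    (35)

The method can readily be extended to give any symmetrical
function of the boson variables in terms of the $\eta$'s and $\bar\eta$'s.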

The foregoing theory can easily be generalized to apply to an
assembly of bosons in interaction with some other dynamical system,
which we shall call for definiteness the atom. We must introduce a
set of basic kets, $|\zeta'\rangle$ say, for the atom alone. We can then get a set
of basic kets for the whole system of atom and bosons together by
multiplying each of the kets $|\zeta'\rangle$ into each of the kets (9). We may
write these kets

    $|\zeta'\rangle$, $|\zeta' \alpha^a\rangle$, $S|\zeta' \alpha^a \alpha^b\rangle$, $S|\zeta' \alpha^a \alpha^b \alpha^c\rangle$, ....    (36)

We may look upon the system as composed of the atom in interaction
with a set of oscillators, so that it can be described in terms of the
atom variables and the oscillator variables $\eta_a$, $\bar\eta_a$. Using again the
standard ket $\rangle_S$ for the set of oscillators, we have

    $S|\zeta' \alpha^a \alpha^b \alpha^c \ldots\rangle = \eta_a \eta_b \eta_c \ldots \rangle_S |\zeta'\rangle$    (37)

corresponding to (17), as the equation expressing the basic kets
(36) in terms of the oscillator variables.

Any function of the atom variables and boson variables which is
symmetrical between all the bosons is expressible as a function of the
atom variables and the $\eta$'s and $\bar\eta$'s. Consider first a function $U_T$ of
the form (22) with $U_r$ a function only of the atom variables and the
variables of the $r$th boson, so that it has a representative $\langle\zeta' \alpha^a_r|U_r|\zeta'' \alpha^b_r\rangle$.
This representative must be independent of $r$ in order that $U_T$ may
be symmetrical between all the bosons, so we may write it
$\langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle$. Now let us define $\langle a|U|b\rangle$ to be that function of the
atom variables whose representative is $\langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle$, so that we have

    $\langle\zeta' \alpha^a_r|U_r|\zeta'' \alpha^b_r\rangle = \langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle = \langle\zeta'|\langle a|U|b\rangle|\zeta''\rangle$,    (38)

corresponding to (23). The equations (24)-(28) can now be taken over
and applied to the present work if both sides of all these equations
are multiplied by $|\zeta'\rangle$ on the right, with the result that formula (29)
still holds. We can deal similarly with a symmetrical function $V_T$ of
the form (30) with $V_{rs}$ a function only of the atom variables and the
variables of the $r$th and $s$th bosons. Defining $\langle ab|V|cd\rangle$ to be that
function of the atom variables whose representative is

    $\langle\zeta' \alpha^a_r \alpha^b_s|V_{rs}|\zeta'' \alpha^c_r \alpha^d_s\rangle$,

we find that formula (35) still holds.

61. Emission and absorption of bosons 

Let us suppose that the oscillators of the preceding section are 
harmonic oscillators and there is no interaction between them. The 
energy of the $a$th oscillator is then, from (5) of § 34,

\[ H_a = \hbar\omega_a\eta_a\bar\eta_a + \tfrac12\hbar\omega_a. \]

We shall neglect the constant term $\tfrac12\hbar\omega_a$, which is the energy of the
oscillator in its lowest state, the so-called 'zero-point energy'. This
neglect does not have any dynamical consequences, as explained at
the beginning of § 30, and merely involves a redefinition of $H_a$. The
total energy of all the oscillators is now

\[ H_T = \sum_a H_a = \sum_a \hbar\omega_a\eta_a\bar\eta_a = \sum_a \hbar\omega_a n_a \tag{39} \]

with the help of (12). This is of the same form as (10), with $\hbar\omega_a$ for
$H^a$. Thus a set of harmonic oscillators is equivalent to an assembly of
bosons in stationary states with no interaction between them. If an
oscillator of the set is in its $n'$th quantum state, there are $n'$ bosons in
the associated boson state.
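The equivalence just stated can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the function names and the choice of CGS units are assumptions made for illustration. It evaluates the sum (39) from a list of occupation numbers, and separately the constant that has been neglected.

```python
# Sketch: energy of a set of non-interacting harmonic oscillators,
# H_T = sum_a hbar * omega_a * n_a, i.e. (39) with the zero-point term dropped.
HBAR = 1.054571817e-27  # erg s (CGS units are an assumption of this sketch)


def total_energy(omegas, occupations, hbar=HBAR):
    """Return sum_a hbar*omega_a*n_a for matching lists of omega_a and n_a."""
    if len(omegas) != len(occupations):
        raise ValueError("need one occupation number per oscillator")
    return sum(hbar * w * n for w, n in zip(omegas, occupations))


def zero_point_energy(omegas, hbar=HBAR):
    """The neglected constant sum_a (1/2)*hbar*omega_a."""
    return sum(0.5 * hbar * w for w in omegas)
```

With `hbar=1.0`, two oscillators of angular frequency 1 and 2 holding 3 and 1 bosons respectively have energy 3·1 + 1·2 = 5 in these units.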

In general the Hamiltonian for the set of oscillators will be a power 
series in the variables $\eta_a$, $\bar\eta_a$, say

\[ H_T = H_P + \sum_a (U_a\eta_a + \bar U_a\bar\eta_a) + \sum_{ab} (U_{ab}\eta_a\bar\eta_b + V_{ab}\eta_a\eta_b + \bar V_{ab}\bar\eta_a\bar\eta_b) + \ldots, \tag{40} \]

where $H_P$, $U_a$, $U_{ab}$, $V_{ab}$ are numbers, $H_P$ being real and $U_{ab} = \bar U_{ba}$. If
the set of oscillators are in interaction with an atom, as we had at
the end of the preceding section, the total Hamiltonian will still be
of the form (40), with $H_P$, $U_a$, $U_{ab}$, $V_{ab}$ functions of the atom variables,
$H_P$ in particular being the Hamiltonian for the atom by itself. A
general treatment of this dynamical system would be rather complicated
and for practical applications one assumes that the terms

\[ H_P + \sum_a U_{aa}\eta_a\bar\eta_a \tag{41} \]

are large compared with the others and form by themselves an
unperturbed system, the remaining terms being taken into account
as a perturbation producing transitions in the unperturbed system,
according to the theory of § 44. If, further, $U_{aa}$ is independent of the
atom variables, the unperturbed system with Hamiltonian (41) consists
merely of an atom with Hamiltonian $H_P$ and an assembly of
bosons in stationary states with Hamiltonian of the form (39), with
no interaction.

Let us consider what kinds of transitions are produced by the 
various perturbation terms in (40). Take a stationary state of the 
unperturbed system for which the atom is in a stationary state, $\zeta'$ say,
and bosons are present in the stationary boson states $a$, $b$, $c$,.... This
stationary state for the unperturbed system corresponds to the ket

\[ \eta_a\eta_b\eta_c\ldots\rangle_S\,|\zeta'\rangle, \tag{42} \]

like (37). If the term $U_x\eta_x$ of (40) is multiplied into this ket, the
result is a linear combination of kets like

\[ \eta_x\eta_a\eta_b\eta_c\ldots\rangle_S\,|\zeta''\rangle, \tag{43} \]

$\zeta''$ denoting any stationary state of the atom. The ket (43) refers to
one more boson than the ket (42), the extra boson being in the state $x$.
Thus the perturbation term $U_x\eta_x$ gives rise to transitions in which
one boson is emitted into state $x$ and the atom makes an arbitrary
jump. If the term $\bar U_x\bar\eta_x$ of (40) is multiplied into (42), the result is
zero unless (42) contains a factor $\eta_x$ and is then a linear combination
of kets like

\[ \eta_x^{-1}\eta_a\eta_b\eta_c\ldots\rangle_S\,|\zeta''\rangle, \]

referring to one boson less in state $x$. Thus the perturbation term
$\bar U_x\bar\eta_x$ gives rise to transitions in which one boson is absorbed from
state $x$, the atom again making an arbitrary jump. Similarly, we find
that a perturbation term $U_{xy}\eta_x\bar\eta_y$ ($x \neq y$) gives rise to processes in
which a boson is absorbed from state $y$ and one is emitted into state
$x$, or, what is the same thing physically, one boson makes a transition
from state $y$ to state $x$. This kind of process would be produced by
a term like the $U_T$ of (22) and (29) in the perturbation energy, provided
the diagonal elements $\langle a|U|a\rangle$ vanish. Again, the perturbation
terms $V_{xy}\eta_x\eta_y$, $\bar V_{xy}\bar\eta_x\bar\eta_y$ give rise to processes in which two bosons are
emitted or absorbed, and so on for more complicated terms. With
any of these emission and absorption processes the atom can make
an arbitrary jump.

Let us determine how the probability of occurrence of each of these 
transition processes depends on the numbers of bosons originally 
present in the various boson states. From §§ 44, 46 the transition 
probability is always proportional to the square of the modulus of 
the matrix element of the perturbation energy referring to the two 
states concerned. Thus the probability of a boson being emitted into
state $x$ with the atom making a jump from state $\zeta'$ to state $\zeta''$ is
proportional to

\[ |\langle n'_1 n'_2 \ldots (n'_x+1) \ldots|\langle\zeta''|U_x\eta_x|\zeta'\rangle|n'_1 n'_2 \ldots n'_x \ldots\rangle|^2, \tag{44} \]

the $n'$'s being the numbers of bosons initially present in the various
boson states. Now from (6) and (17), with reference to (4),

\[ |n'_1 n'_2 n'_3 \ldots\rangle = (n'_1!\,n'_2!\,n'_3!\ldots)^{-1/2}\,\eta_1^{n'_1}\eta_2^{n'_2}\eta_3^{n'_3}\ldots\rangle_S, \tag{45} \]

so that

\[ \eta_x|n'_1 n'_2 \ldots n'_x \ldots\rangle = (n'_x+1)^{1/2}\,|n'_1 n'_2 \ldots (n'_x+1) \ldots\rangle. \tag{46} \]

Hence (44) is equal to

\[ (n'_x+1)\,|\langle\zeta''|U_x|\zeta'\rangle|^2, \tag{47} \]

showing that the probability of a transition in which a boson is emitted
into state $x$ is proportional to the number of bosons originally in state $x$
plus one.

The probability of a boson being absorbed from state $x$ with the
atom making a jump from state $\zeta'$ to state $\zeta''$ is proportional to

\[ |\langle n'_1 n'_2 \ldots (n'_x-1) \ldots|\langle\zeta''|\bar U_x\bar\eta_x|\zeta'\rangle|n'_1 n'_2 \ldots n'_x \ldots\rangle|^2, \tag{48} \]

the $n'$'s again being the numbers of bosons initially present in the
various boson states. Now from (45)

\[ \bar\eta_x|n'_1 n'_2 \ldots n'_x \ldots\rangle = {n'_x}^{1/2}\,|n'_1 n'_2 \ldots (n'_x-1) \ldots\rangle, \tag{49} \]

so (48) is equal to

\[ n'_x\,|\langle\zeta''|\bar U_x|\zeta'\rangle|^2. \tag{50} \]


Thus the probability of a transition in which a boson is absorbed from 
state x is proportional to the number of bosons originally in state x. 
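The two counting rules just obtained can be sketched in a few lines. This is an illustration only: representing a state of the assembly as a dictionary of occupation numbers, and the names `emit` and `absorb`, are assumptions of the sketch and do not come from the text. The amplitudes are those of the emission and absorption operators acting on a normalized number state, so their squares give the factors $n'_x+1$ and $n'_x$.

```python
import math


def emit(state, x):
    """Apply the emission operator for boson state x to a number state.

    `state` maps boson-state labels to occupation numbers. Returns
    (amplitude, new_state) with amplitude sqrt(n + 1).
    """
    n = state.get(x, 0)
    new_state = dict(state)
    new_state[x] = n + 1
    return math.sqrt(n + 1), new_state


def absorb(state, x):
    """Apply the absorption operator for boson state x to a number state.

    Returns (amplitude, new_state) with amplitude sqrt(n), or (0.0, None)
    when there is no boson in x to absorb.
    """
    n = state.get(x, 0)
    if n == 0:
        return 0.0, None
    new_state = dict(state)
    new_state[x] = n - 1
    return math.sqrt(n), new_state
```

For three bosons initially in state x, the squared emission amplitude is 4 and the squared absorption amplitude is 3.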

Similar methods may be applied to more complicated processes, 
and show that the probability of a process in which a boson makes 
a transition from state $y$ to state $x$ ($x \neq y$) is proportional to $n'_y(n'_x+1)$.
More generally, the probability of a process in which bosons are
absorbed from states $x$, $y$,... and emitted into states $a$, $b$,... is proportional to

\[ n'_x n'_y \ldots (n'_a+1)(n'_b+1) \ldots, \tag{51} \]

the $n'$'s being in each case the numbers of bosons originally present.
These results hold both for direct transition processes and transition 
processes that take place through one or more intermediate states, 
in accordance with the interpretation given at the end of § 44. 
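The factor (51) is easily evaluated for a given process. The following is a minimal sketch under the assumption, made in the text, that all the states concerned are distinct; the function name and the dictionary representation of the occupation numbers are hypothetical.

```python
def statistical_factor(occupations, absorbed, emitted):
    """Product of n'_x over absorbed states and (n'_a + 1) over emitted states.

    `occupations` maps state labels to the numbers of bosons originally
    present; states not listed are taken to be empty. This sketch assumes
    that every state named in `absorbed` and `emitted` is distinct.
    """
    states = list(absorbed) + list(emitted)
    if len(set(states)) != len(states):
        raise ValueError("this sketch assumes all states are distinct")
    factor = 1
    for x in absorbed:
        factor *= occupations.get(x, 0)
    for a in emitted:
        factor *= occupations.get(a, 0) + 1
    return factor
```

A transition from state y (holding 2 bosons) to state x (holding 3) thus carries the factor 2 × 4 = 8, while absorption from an empty state gives zero.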

62. Application to photons 

Since photons are bosons, the foregoing theory can be applied to 
them. A photon is in a stationary state when it is in an eigenstate 
of momentum. It then has two independent states of polarization, 
which may be taken to be two perpendicular states of linear polarization.
The dynamical variables needed to describe the stationary
states are then the momentum $\mathbf{p}$, a vector, and a polarization variable
$\mathbf{l}$, consisting of a unit vector perpendicular to $\mathbf{p}$. The variables $\mathbf{p}$ and
$\mathbf{l}$ take the place of our previous $\alpha$'s. The eigenvalues of $\mathbf{p}$ consist of
all numbers from $-\infty$ to $\infty$ for each of the three Cartesian components
of $\mathbf{p}$, while for each eigenvalue $\mathbf{p}'$ of $\mathbf{p}$, $\mathbf{l}$ has just two
eigenvalues, namely two arbitrarily chosen vectors perpendicular
to $\mathbf{p}'$ and to one another. Owing to the eigenvalues of $\mathbf{p}$ forming
a continuous range, there are a continuous range of stationary
states, giving us the continuous basic kets $|\mathbf{p}'\mathbf{l}'\rangle$. However, the foregoing
theory was built up in terms of discrete basic kets $|\alpha'\rangle$ for a
boson. There are two formalisms which one may use for getting over
this discrepancy.

The first consists in replacing the continuous three-dimensional
distribution of eigenvalues for $\mathbf{p}$ by a large number of discrete points
lying very close together, forming a dust spread over the whole three-dimensional
$\mathbf{p}$-space. Let $s_{\mathbf{p}'}$ be the density of the dust (the number
of points per unit volume) in the neighbourhood of any point $\mathbf{p}'$.
Then $s_{\mathbf{p}'}$ must be large and positive, but is otherwise an arbitrary
function of $\mathbf{p}'$. An integral over the $\mathbf{p}$-space may be replaced by a
sum over the dust of points, in accordance with the formula

\[ \iiint f(\mathbf{p}')\,dp'_x\,dp'_y\,dp'_z = \sum_{\mathbf{p}'} f(\mathbf{p}')\,s_{\mathbf{p}'}^{-1}, \tag{52} \]

which formula provides the basis of the passage from continuous $\mathbf{p}'$
values to discrete ones and vice versa. Any problem can be worked
out in terms of the discrete $\mathbf{p}'$ values, for which the theory of §§ 59-61
can be used, and the results can be transformed back to refer to continuous
$\mathbf{p}'$ values. The arbitrary density $s_{\mathbf{p}'}$ should then disappear
from the results.
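That the arbitrary density drops out of a sum of the form (52) may be checked numerically. The sketch below is a one-dimensional analogue only, and the particular non-uniform dust (points at $p = \sinh u$ for equally spaced $u$) is an assumption chosen for illustration, not something taken from the text.

```python
import math


def dust_sum(f, n_points=20001, u_max=6.0):
    """One-dimensional analogue of (52): sum of f(p') / s_{p'} over a dust.

    The dust points are p = sinh(u) for equally spaced u, so they lie more
    densely near p = 0. The number of points per unit p near a point is
    s = 1 / (cosh(u) * du).
    """
    du = 2.0 * u_max / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        u = -u_max + i * du
        p = math.sinh(u)
        density = 1.0 / (math.cosh(u) * du)  # s_{p'} for this dust
        total += f(p) / density
    return total
```

Applied to $f(p) = e^{-p^2}$ the sum reproduces $\int e^{-p^2}\,dp = \sqrt{\pi}$ to high accuracy, although the dust is far from uniform.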

The second formalism consists in modifying the equations of the 
theory of §§ 59-61 so as to make them apply to the case of a continuous
range of basic kets $|\alpha'\rangle$ by replacing sums by integrals and
replacing the $\delta$ symbol in the commutation relations (11) by $\delta$ functions,
so far as concerns the variables with continuous eigenvalues.
Each of these formalisms has some advantages and some disadvantages.
The first is usually more convenient for physical discussion,
the second for mathematical development. Both will be developed 
here and one or other will be used according to which is more suitable 
at the moment. 


The Hamiltonian describing an assembly of photons interacting 
with an atom will be of the general form (40), with the coefficients
$H_P$, $U_a$, $U_{ab}$, $V_{ab}$ involving the atom variables. This Hamiltonian may
be written

\[ H_T = H_P + H_Q + H_R, \tag{53} \]

where $H_P$ is the energy of the atom alone, $H_R$ is the energy of the
assembly of photons alone,

\[ H_R = \sum_{\mathbf{p}'\mathbf{l}'} n_{\mathbf{p}'\mathbf{l}'}\,h\nu_{\mathbf{p}'}, \tag{54} \]

$\nu_{\mathbf{p}'}$ being the frequency of a photon of momentum $\mathbf{p}'$, and $H_Q$ is the
interaction energy, which can be evaluated from analogy with the
classical theory, as will be shown in the next section. The whole
system can be treated by a perturbation method as discussed in the
preceding section, $H_P$ and $H_R$ providing the energy (41) of the
unperturbed system and $H_Q$ being the perturbation energy, which
gives rise to transition processes in which photons are emitted and 
absorbed and the atom jumps from one stationary state to another. 

We saw in the preceding section that the probability of an absorption
process is proportional to the number of bosons originally in the
state from which a boson is absorbed. From this we can infer that
the probability of a photon being absorbed from a beam of radiation
incident on an atom is proportional to the intensity of the beam.
We also saw that the probability of an emission process is proportional
to the number of bosons originally in the state concerned plus
one. To interpret this result we must make a careful study of the
relations involved in replacing the continuous range of photon states 
by a discrete set. 

Let us neglect for the present the polarization variable $\mathbf{l}$. Let
$|\mathbf{p}'D\rangle$ be the normalized ket corresponding to the discrete photon
state $\mathbf{p}'$. Then from (22) of § 16

\[ \sum_{\mathbf{p}'} |\mathbf{p}'D\rangle\langle\mathbf{p}'D| = 1, \]

which gives from (52)

\[ \int |\mathbf{p}'D\rangle\langle\mathbf{p}'D|\,s_{\mathbf{p}'}\,d^3p' = 1, \tag{55} \]

$d^3p'$ being written for $dp'_x\,dp'_y\,dp'_z$, for brevity. Now if $|\mathbf{p}'\rangle$ is the basic
ket corresponding to the continuous state $\mathbf{p}'$, we have according to
(24) of § 16

\[ \int |\mathbf{p}'\rangle\langle\mathbf{p}'|\,d^3p' = 1, \]

which shows, on comparison with (55), that

\[ |\mathbf{p}'\rangle = |\mathbf{p}'D\rangle\,s_{\mathbf{p}'}^{1/2}. \tag{56} \]

The connexion between $|\mathbf{p}'\rangle$ and $|\mathbf{p}'D\rangle$ is like the connexion between
the basic kets when one changes the weight function of the representation,
as shown by (38) of § 16.

With $n'_{\mathbf{p}'}$ photons in each discrete photon state $\mathbf{p}'$, the Gibbs
density $\rho$ for the assembly of photons is, according to (68) of § 33,

\[ \rho = \sum_{\mathbf{p}'} |\mathbf{p}'D\rangle n'_{\mathbf{p}'}\langle\mathbf{p}'D| = \int |\mathbf{p}'D\rangle n'_{\mathbf{p}'}\langle\mathbf{p}'D|\,s_{\mathbf{p}'}\,d^3p' = \int |\mathbf{p}'\rangle n'_{\mathbf{p}'}\langle\mathbf{p}'|\,d^3p' \tag{57} \]

with the help of (56). The number of photons per unit volume in the
neighbourhood of any point $\mathbf{x}'$ is then $\langle\mathbf{x}'|\rho|\mathbf{x}'\rangle$, according to (73)
of § 33. From (57) this equals

\[ \langle\mathbf{x}'|\rho|\mathbf{x}'\rangle = \int \langle\mathbf{x}'|\mathbf{p}'\rangle n'_{\mathbf{p}'}\langle\mathbf{p}'|\mathbf{x}'\rangle\,d^3p' = \int h^{-3}n'_{\mathbf{p}'}\,d^3p' \tag{58} \]

if one puts in the value of the transformation function $\langle\mathbf{x}'|\mathbf{p}'\rangle$ given
by (54) of § 23. Equation (58) expresses the number of photons per
unit volume as an integral over the momentum space, so the integrand
in (58) can be interpreted as the number of photons per unit
of phase space. We obtain in this way the result that the number of
photons per unit of phase space is equal to $h^{-3}$ times the number of
photons per discrete state, in other words, a cell of volume $h^3$ in phase
space is equivalent to a discrete state. This result is a general one,
holding for any kind of particle. If the polarization variable of the
photons is not neglected, the result holds for each of the two independent
states of polarization.

The momentum of a photon of frequency $\nu$ is of magnitude $h\nu/c$,
so the element of momentum space

\[ dp_x\,dp_y\,dp_z = h^3c^{-3}\nu^2\,d\nu\,d\omega, \]

$d\omega$ being an element of solid angle for the direction of the vector $\mathbf{p}$.
Thus a distribution of photons with $n'_{\mathbf{p}}$ per discrete state, which is
equivalent to a distribution of $h^{-3}n'_{\mathbf{p}}\,d^3p\,d^3x$ photons in an element
of volume $d^3x$ and an element of momentum space $d^3p$, equals a
distribution of $n'_{\mathbf{p}}c^{-3}\nu^2\,d\nu\,d\omega\,d^3x$ photons in an element of volume $d^3x$
and a frequency range $d\nu$ and direction of motion $d\omega$. This corresponds
to an energy density $n'_{\mathbf{p}}hc^{-3}\nu^3$ per unit solid angle per unit
frequency range, or an intensity per unit frequency range (i.e. an
energy crossing unit area per unit time per unit frequency range) of
amount

\[ I_\nu = n'_{\mathbf{p}}h\nu^3/c^2. \tag{59} \]


The result that the probability of a photon being emitted is proportional
to $n'_{\mathbf{pl}}+1$, $n'_{\mathbf{pl}}$ being the number of photons initially present
in the discrete state concerned, can now be interpreted as the probability
being proportional to $I_{\nu\mathbf{l}} + h\nu^3/c^2$, where $I_{\nu\mathbf{l}}$ is the intensity of
the incident radiation per unit frequency range in the neighbourhood
of the frequency of the emitted photon and having the same polarization
$\mathbf{l}$ as the emitted photon. Thus with no incident radiation there
is still a certain amount of emission, but the emission is increased or
stimulated by incident radiation in the same direction and having the
same frequency and polarization as the emitted radiation. The
present theory of radiation thus completes the imperfect one of § 45
by giving both stimulated and spontaneous emission. The ratio it
gives for the two kinds of emission, namely $I_{\nu\mathbf{l}} : h\nu^3/c^2$, is in agreement
with that provided by Einstein's theory of statistical equilibrium
mentioned in § 45.
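The connexion between the number of photons per discrete state and the intensity may be made concrete as follows. This is a sketch only: the function names are hypothetical and the CGS values of the constants are an assumption of the illustration. It computes the intensity $n h\nu^3/c^2$ corresponding to $n$ photons per discrete state and separates the emission factor into its stimulated and spontaneous parts.

```python
H_PLANCK = 6.62607015e-27   # erg s (CGS units are an assumption of this sketch)
C_LIGHT = 2.99792458e10     # cm / s


def intensity_per_frequency(n, nu):
    """Intensity per unit frequency range for n photons per discrete state."""
    return n * H_PLANCK * nu ** 3 / C_LIGHT ** 2


def emission_split(n, nu):
    """Return the (stimulated, spontaneous) parts of I + h nu^3 / c^2."""
    stimulated = intensity_per_frequency(n, nu)
    spontaneous = H_PLANCK * nu ** 3 / C_LIGHT ** 2
    return stimulated, spontaneous
```

The ratio of the two parts returned by `emission_split` is just $n$, the number of photons per discrete state, whatever the frequency.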

The probability of a photon being scattered from the state $\mathbf{p}'\mathbf{l}'$ to
the state $\mathbf{p}''\mathbf{l}''$ is proportional to $n'_{\mathbf{p}'\mathbf{l}'}(n'_{\mathbf{p}''\mathbf{l}''}+1)$, the $n'$'s being the
numbers of photons initially in the discrete states concerned. We can
interpret this result as the probability being proportional to

\[ I_{\nu'\mathbf{l}'}\,(I_{\nu''\mathbf{l}''} + h\nu''^3/c^2). \tag{60} \]

Similarly for a more general radiative process in which several
photons are emitted and absorbed, the probability is proportional
to a factor $I_{\nu\mathbf{l}}$ for each absorbed photon and a factor $I_{\nu\mathbf{l}} + h\nu^3/c^2$ for
each emitted photon. Thus the process is stimulated by incident
radiation in the same direction and with the same frequency and
polarization as any of the emitted photons.

63. The interaction energy between photons and an atom 

We shall now determine the interaction energy between an atom 
and an assembly of photons, i.e. the $H_Q$ of equation (53), from
analogy with the classical expression for the interaction energy
between an atom and a field of radiation. For simplicity we shall
suppose the atom to consist of a single electron moving in an electrostatic
field of force. The field of radiation may be described by a
scalar and a vector potential. These potentials are to a certain extent
arbitrary and may be chosen so that the scalar potential vanishes.
The field is then completely described by the vector potential $A_x$, $A_y$,
$A_z$, or $\mathbf{A}$. The change that the field causes in the Hamiltonian
describing the atom is now, as explained at the beginning of § 41,

\[ H_Q = \frac{e}{mc}(\mathbf{p}, \mathbf{A}) + \frac{e^2}{2mc^2}\mathbf{A}^2. \tag{61} \]

This is the classical interaction energy. The A that occurs here should 
be the value of the vector potential at the point where the electron is 
momentarily situated. It is, however, a good enough approximation 
if we take this A to be the vector potential at some fixed point in the 
atom, such as the nucleus, provided we are dealing with radiation 
whose wavelength is large compared with the dimensions of the atom.

Let us first consider the field of radiation classically and ignore its
interaction with the atom. The vector potential $\mathbf{A}$ satisfies, according
to Maxwell's theory, the equations

\[ \Box\mathbf{A} = 0, \qquad \operatorname{div}\mathbf{A} = 0, \tag{62} \]

$\Box$ being short for $\partial^2/c^2\partial t^2 - \partial^2/\partial x^2 - \partial^2/\partial y^2 - \partial^2/\partial z^2$. The first of these
equations shows that $\mathbf{A}$ can be resolved into Fourier components in
the form

\[ \mathbf{A} = \int \{\mathbf{A}_\mathbf{k}\,e^{2\pi i\nu_\mathbf{k}t}e^{-i(\mathbf{kx})} + \bar{\mathbf{A}}_\mathbf{k}\,e^{-2\pi i\nu_\mathbf{k}t}e^{i(\mathbf{kx})}\}\,d^3k, \tag{63} \]

each Fourier component representing a train of waves moving with
the velocity of light, described by a vector $\mathbf{k}$ whose direction gives
the direction of motion of the waves and whose magnitude $|\mathbf{k}|$ is
connected with their frequency $\nu_\mathbf{k}$ by

\[ 2\pi\nu_\mathbf{k} = c|\mathbf{k}|. \tag{64} \]

The vector $\mathbf{k}$ is just the momentum of a photon which the quantum
theory would associate with these waves, divided by $\hbar$. For each
value of $\mathbf{k}$ we have an amplitude $\mathbf{A}_\mathbf{k}$, which is in general a complex
vector, and the integral in (63) extends over the whole of the three-dimensional
$\mathbf{k}$-space. The second of equations (62) gives

\[ (\mathbf{k}, \mathbf{A}_\mathbf{k}) = 0, \tag{65} \]

showing that for each value of $\mathbf{k}$, $\mathbf{A}_\mathbf{k}$ is perpendicular to $\mathbf{k}$. This
expresses that the waves are transverse waves. $\mathbf{A}_\mathbf{k}$ is determined by
its two components in two directions perpendicular to each other and
to $\mathbf{k}$, these two components corresponding to two independent states
of linear polarization.
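The two directions of polarization belonging to a given $\mathbf{k}$ are arbitrary apart from being perpendicular to $\mathbf{k}$ and to each other. One way of choosing them may be sketched as follows; the construction by vector products with a helper axis, and the function names, are assumptions of this illustration and not the method of the text.

```python
import math


def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two real 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def unit(a):
    """Rescale a non-zero vector to unit length."""
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)


def polarization_vectors(k):
    """Return two unit vectors perpendicular to the non-zero vector k
    and to each other."""
    # Use the coordinate axis least aligned with k as a helper direction,
    # so that the first vector product cannot vanish.
    axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    helper = min(axes, key=lambda e: abs(dot(e, k)))
    l1 = unit(cross(k, helper))
    l2 = unit(cross(k, l1))
    return l1, l2
```

Any rotation of the pair about $\mathbf{k}$ would serve equally well, which reflects the arbitrariness of the choice.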

The total energy of the radiation is given by the volume integral 

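\[ H_R = (8\pi)^{-1}\int (\mathcal{E}^2 + \mathcal{H}^2)\,d^3x \tag{66} \]

taken over the whole of space, where the electric field $\mathcal{E}$ and the
magnetic field $\mathcal{H}$ of the radiation are given by

\[ \mathcal{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathcal{H} = \operatorname{curl}\mathbf{A}. \tag{67} \]

Using standard formulas of vector analysis, we have

\[ \operatorname{div}[\mathbf{A}\times\mathcal{H}] = (\mathcal{H}, \operatorname{curl}\mathbf{A}) - (\mathbf{A}, \operatorname{curl}\mathcal{H}) = \mathcal{H}^2 - (\mathbf{A}, \operatorname{curl}\operatorname{curl}\mathbf{A}) = \mathcal{H}^2 + (\mathbf{A}, \nabla^2\mathbf{A}) \]

with the help of the second of equations (62). Thus (66) becomes,
with neglect of a term which can be transformed to a surface integral
at infinity,

\[ H_R = (8\pi)^{-1}\int \Big\{\frac{1}{c^2}\Big(\frac{\partial\mathbf{A}}{\partial t}, \frac{\partial\mathbf{A}}{\partial t}\Big) - (\mathbf{A}, \nabla^2\mathbf{A})\Big\}\,d^3x. \tag{68} \]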

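By substituting for $\mathbf{A}$ here its value given by (63), we can get the
energy of the radiation in terms of the Fourier amplitudes $\mathbf{A}_\mathbf{k}$. The
energy of the radiation is constant (since we are now ignoring the
interaction of the radiation and the atom), so in this calculation we
may take $t = 0$. This means taking

\[ \mathbf{A} = \int (\mathbf{A}_\mathbf{k} + \bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{kx})}\,d^3k, \tag{69} \]

\[ \nabla^2\mathbf{A} = -\int \mathbf{k}^2(\mathbf{A}_\mathbf{k} + \bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{kx})}\,d^3k, \]

\[ \partial\mathbf{A}/\partial t = ic\int |\mathbf{k}|(\mathbf{A}_\mathbf{k} - \bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{kx})}\,d^3k. \tag{70} \]

Inserting these expressions in (68), we get

\[ H_R = (8\pi)^{-1}\iiint \{\mathbf{k}'^2(\mathbf{A}_\mathbf{k}+\bar{\mathbf{A}}_{-\mathbf{k}}, \mathbf{A}_{\mathbf{k}'}+\bar{\mathbf{A}}_{-\mathbf{k}'}) - |\mathbf{k}||\mathbf{k}'|(\mathbf{A}_\mathbf{k}-\bar{\mathbf{A}}_{-\mathbf{k}}, \mathbf{A}_{\mathbf{k}'}-\bar{\mathbf{A}}_{-\mathbf{k}'})\}\,e^{-i(\mathbf{kx})}e^{-i(\mathbf{k}'\mathbf{x})}\,d^3k\,d^3k'\,d^3x \]

\[ = \pi^2\iint \{\mathbf{k}'^2(\mathbf{A}_\mathbf{k}+\bar{\mathbf{A}}_{-\mathbf{k}}, \mathbf{A}_{\mathbf{k}'}+\bar{\mathbf{A}}_{-\mathbf{k}'}) - |\mathbf{k}||\mathbf{k}'|(\mathbf{A}_\mathbf{k}-\bar{\mathbf{A}}_{-\mathbf{k}}, \mathbf{A}_{\mathbf{k}'}-\bar{\mathbf{A}}_{-\mathbf{k}'})\}\,\delta(\mathbf{k}+\mathbf{k}')\,d^3k\,d^3k', \]

with the help of formula (49) of § 23, $\delta(\mathbf{k}+\mathbf{k}')$ being the product of
three factors, one for each component of $\mathbf{k}$. Hence

\[ H_R = \pi^2\int \mathbf{k}^2\{(\mathbf{A}_\mathbf{k}+\bar{\mathbf{A}}_{-\mathbf{k}}, \mathbf{A}_{-\mathbf{k}}+\bar{\mathbf{A}}_\mathbf{k}) - (\mathbf{A}_\mathbf{k}-\bar{\mathbf{A}}_{-\mathbf{k}}, \mathbf{A}_{-\mathbf{k}}-\bar{\mathbf{A}}_\mathbf{k})\}\,d^3k \]

\[ = 2\pi^2\int \mathbf{k}^2\{(\mathbf{A}_\mathbf{k}, \bar{\mathbf{A}}_\mathbf{k}) + (\bar{\mathbf{A}}_{-\mathbf{k}}, \mathbf{A}_{-\mathbf{k}})\}\,d^3k \]

\[ = 4\pi^2\int \mathbf{k}^2(\mathbf{A}_\mathbf{k}, \bar{\mathbf{A}}_\mathbf{k})\,d^3k. \tag{71} \]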
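The algebraic step in the middle of this reduction, in which the difference of two scalar products collapses to twice the sum of two simpler ones, can be checked numerically for arbitrary complex amplitudes. The sketch below is an illustration only; the function names are hypothetical, and the scalar product is taken without any implicit conjugation, the bars being applied explicitly, which is an assumption about the notation.

```python
def sp(a, b):
    """Scalar product (a, b) = sum_i a_i b_i, with no implicit conjugation."""
    return sum(x * y for x, y in zip(a, b))


def conj(a):
    """Complex conjugate of each component."""
    return [x.conjugate() for x in a]


def integrand_before(a_k, a_mk):
    """(A_k + conj A_-k, A_-k + conj A_k) - (A_k - conj A_-k, A_-k - conj A_k)."""
    plus_left = [x + y for x, y in zip(a_k, conj(a_mk))]
    plus_right = [x + y for x, y in zip(a_mk, conj(a_k))]
    minus_left = [x - y for x, y in zip(a_k, conj(a_mk))]
    minus_right = [x - y for x, y in zip(a_mk, conj(a_k))]
    return sp(plus_left, plus_right) - sp(minus_left, minus_right)


def integrand_after(a_k, a_mk):
    """2 (A_k, conj A_k) + 2 (conj A_-k, A_-k)."""
    return 2 * sp(a_k, conj(a_k)) + 2 * sp(conj(a_mk), a_mk)
```

The two functions agree for any pair of complex vectors, the cross terms between the amplitudes for $\mathbf{k}$ and $-\mathbf{k}$ cancelling.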

We can replace the continuous distribution of $\mathbf{k}$-values by a dust of
discrete $\mathbf{k}$-values, like we did with the $\mathbf{p}$-values in the preceding
section. The integral (71) then goes over, according to formula (52),
into the sum

\[ H_R = 4\pi^2\sum_\mathbf{k} \mathbf{k}^2(\mathbf{A}_\mathbf{k}, \bar{\mathbf{A}}_\mathbf{k})\,s_\mathbf{k}^{-1}, \]

$s_\mathbf{k}$ being the density of the discrete $\mathbf{k}$-values. We may also write
this as

\[ H_R = 4\pi^2\sum_{\mathbf{kl}} \mathbf{k}^2 A_{\mathbf{kl}}\bar A_{\mathbf{kl}}\,s_\mathbf{k}^{-1}, \tag{72} \]

$A_{\mathbf{kl}}$ being a component of $\mathbf{A}_\mathbf{k}$ in a direction $\mathbf{l}$ perpendicular to $\mathbf{k}$ and
the summation with respect to $\mathbf{l}$ referring to two directions $\mathbf{l}$ perpendicular
to each other. Thus there is one term in (72) for each independent
stationary state for a photon.

The field quantities $\mathcal{E}$ and $\mathcal{H}$ at any point $\mathbf{x}$ can be looked upon
as dynamical variables. The quantities

\[ A_{\mathbf{kl}t} = A_{\mathbf{kl}}\,e^{2\pi i\nu_\mathbf{k}t}, \qquad \bar A_{\mathbf{kl}t} = \bar A_{\mathbf{kl}}\,e^{-2\pi i\nu_\mathbf{k}t} \]

are then dynamical variables at time $t$, since they are connected with
$\mathcal{E}$ and $\mathcal{H}$ at various points $\mathbf{x}$ at time $t$ by equations which do not
involve $t$, as follows from (63) and (67). $A_{\mathbf{kl}}$ is constant, so $A_{\mathbf{kl}t}$ varies
with $t$ according to the simple harmonic law. Thus $A_{\mathbf{kl}t}$ is like the $\eta_t$
of a harmonic oscillator, defined by (3) of § 34, the $\omega$ of the oscillator
being $2\pi\nu_\mathbf{k}$. We may take each $A_{\mathbf{kl}t}$ to be proportional to the $\eta_t$ of
some harmonic oscillator and then the field of radiation becomes a
set of harmonic oscillators.

Let us now pass over to the quantum theory and take the $A_{\mathbf{kl}t}$, $\bar A_{\mathbf{kl}t}$
to be dynamical variables in the Heisenberg picture. The expression
(72) for the energy may be retained unchanged, the order in which
the factors $A_{\mathbf{kl}}$, $\bar A_{\mathbf{kl}}$ there occur being the correct one to give no zero-point
energy. The $A_{\mathbf{kl}t}$ then still vary with time according to the $e^{i\omega t}$
law and may still be taken to be proportional to the $\eta_t$'s of harmonic
oscillators. The factor of proportionality may be obtained by equating
(72) to the expression (39) for the energy, with the label $a$ replaced
by the two labels $\mathbf{k}$ and $\mathbf{l}$ and with $h\nu_\mathbf{k}$ for $\hbar\omega_a$. This gives

\[ 4\pi^2\sum_{\mathbf{kl}} \mathbf{k}^2 A_{\mathbf{kl}t}\bar A_{\mathbf{kl}t}\,s_\mathbf{k}^{-1} = \sum_{\mathbf{kl}} h\nu_\mathbf{k}\,\eta_{\mathbf{kl}t}\bar\eta_{\mathbf{kl}t}, \]

the suffix $t$ being inserted to show that we are dealing with Heisenberg
dynamical variables (as we should when transferring equations of the
classical theory to the quantum theory). Hence, using (64),

\[ 4\pi^2 A_{\mathbf{kl}t} = c\,h^{1/2}\nu_\mathbf{k}^{-1/2}\,\eta_{\mathbf{kl}t}\,s_\mathbf{k}^{1/2}, \tag{73} \]

with neglect of an unimportant arbitrary phase factor. In this way
the Heisenberg dynamical variables $\eta_{\mathbf{kl}t}$, which describe the field of
radiation as a set of oscillators, are introduced. The commutation
relations between the $\eta_{\mathbf{kl}t}$ and $\bar\eta_{\mathbf{kl}t}$ are known, being given by (11), so
equation (73) fixes the commutation relations between the $A_{\mathbf{kl}t}$ and
$\bar A_{\mathbf{kl}t}$. It thus fixes the commutation relations between the potentials
$\mathbf{A}$ and the field quantities $\mathcal{E}$ and $\mathcal{H}$ at various points $\mathbf{x}$ at the time $t$.
(Incidentally, the commutation relations of the $A_{\mathbf{kl}}$, $\bar A_{\mathbf{kl}}$ are fixed,
so the commutation relation of two potential or field quantities at
two different times is also fixed.)
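The factor of proportionality just found can be checked for a single mode by treating the oscillator variable as an ordinary complex number. The sketch below is an illustration under that assumption (it ignores the non-commutation of the quantum variables); the function names and the CGS values of the constants are likewise assumptions. It confirms that a field amplitude equal to $c\,h^{1/2}\nu^{-1/2}s^{1/2}/4\pi^2$ times the oscillator variable makes the field energy of the mode, $4\pi^2\mathbf{k}^2 A\bar A/s$ with $|\mathbf{k}| = 2\pi\nu/c$, equal to $h\nu\,\eta\bar\eta$.

```python
import math

H_PLANCK = 6.62607015e-27  # erg s (CGS units are an assumption of this sketch)
C_LIGHT = 2.99792458e10    # cm / s


def amplitude_from_eta(eta, nu, s_k):
    """Field amplitude A for oscillator variable eta, frequency nu, dust density s_k."""
    return C_LIGHT * math.sqrt(H_PLANCK / nu) * eta * math.sqrt(s_k) / (4 * math.pi ** 2)


def mode_energy_field(a, nu, s_k):
    """Field energy of one mode: 4 pi^2 k^2 A conj(A) / s_k with |k| = 2 pi nu / c."""
    k = 2 * math.pi * nu / C_LIGHT
    return 4 * math.pi ** 2 * k ** 2 * (a * a.conjugate()).real / s_k


def mode_energy_oscillator(eta, nu):
    """Oscillator energy of one mode with the zero-point term dropped: h nu |eta|^2."""
    return H_PLANCK * nu * (eta * eta.conjugate()).real
```

The agreement holds for any value of the dust density `s_k`, as it should, since that density is arbitrary.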

We can still use (73) when the interaction between the field of 
radiation and the atom is taken into account. This involves assuming
that the interaction does not affect the commutation relations
between the potentials and field quantities at a given time. The
interaction causes the $\eta_{\mathbf{kl}t}$'s to cease to vary according to the simple
harmonic law and the oscillators to cease to be harmonic. Thus it 
may affect the commutation relation between two potential or field 
quantities at two different times. 

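We can now take over the interaction energy (61) into the quantum
theory, putting $\mathbf{p}_t$ for $\mathbf{p}$ to show it is a Heisenberg dynamical variable.
Taking the atomic nucleus to be at the origin we get, by substituting
(63) with $\mathbf{x} = 0$ into (61),

\[ H_{Qt} = \frac{e}{mc}\int (\mathbf{p}_t, \mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t})\,d^3k + \frac{e^2}{2mc^2}\iint (\mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t}, \mathbf{A}_{\mathbf{k}'t}+\bar{\mathbf{A}}_{\mathbf{k}'t})\,d^3k\,d^3k' \]

\[ = \frac{e}{mc}\sum_\mathbf{k} (\mathbf{p}_t, \mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t})\,s_\mathbf{k}^{-1} + \frac{e^2}{2mc^2}\sum_{\mathbf{kk}'} (\mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t}, \mathbf{A}_{\mathbf{k}'t}+\bar{\mathbf{A}}_{\mathbf{k}'t})\,s_\mathbf{k}^{-1}s_{\mathbf{k}'}^{-1} \]

if we pass from continuous to discrete $\mathbf{k}$-values. Thus

\[ H_{Qt} = \frac{e}{mc}\sum_{\mathbf{kl}} p_{\mathbf{l}t}(A_{\mathbf{kl}t}+\bar A_{\mathbf{kl}t})\,s_\mathbf{k}^{-1} + \frac{e^2}{2mc^2}\sum_{\mathbf{kk}'\mathbf{ll}'} (A_{\mathbf{kl}t}+\bar A_{\mathbf{kl}t})(A_{\mathbf{k}'\mathbf{l}'t}+\bar A_{\mathbf{k}'\mathbf{l}'t})(\mathbf{ll}')\,s_\mathbf{k}^{-1}s_{\mathbf{k}'}^{-1}, \]

$p_{\mathbf{l}t}$ being the component of $\mathbf{p}_t$ in the direction $\mathbf{l}$. With the help of (73)
we may express $H_{Qt}$ in terms of the $\eta_{\mathbf{kl}t}$ and $\bar\eta_{\mathbf{kl}t}$, and we can then drop
the suffix $t$ (which means going over to Schrödinger dynamical
variables), so that we obtain finally

\[ H_Q = \frac{e h^{1/2}}{4\pi^2 m}\sum_{\mathbf{kl}} p_\mathbf{l}\,\nu_\mathbf{k}^{-1/2}(\eta_{\mathbf{kl}}+\bar\eta_{\mathbf{kl}})\,s_\mathbf{k}^{-1/2} + \frac{e^2 h}{32\pi^4 m}\sum_{\mathbf{kk}'\mathbf{ll}'} \nu_\mathbf{k}^{-1/2}\nu_{\mathbf{k}'}^{-1/2}(\eta_{\mathbf{kl}}+\bar\eta_{\mathbf{kl}})(\eta_{\mathbf{k}'\mathbf{l}'}+\bar\eta_{\mathbf{k}'\mathbf{l}'})(\mathbf{ll}')\,s_\mathbf{k}^{-1/2}s_{\mathbf{k}'}^{-1/2}. \tag{74} \]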


With the model of the atom we are using, the interaction energy
appears as a linear plus a quadratic function in the $\eta$'s and $\bar\eta$'s. The
linear terms give rise to emission and absorption processes, the
quadratic ones to scattering processes and processes in which two
photons are absorbed or emitted simultaneously. The order of the
factors $\eta$ and $\bar\eta$ in the quadratic terms is not determined by the
procedure of working from the classical theory, but this order is
unimportant, since a change in it merely changes $H_Q$ by a constant.

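The matrix element of $H_Q$ referring to the emission of a photon
into the discrete state $\mathbf{k}'\mathbf{l}'$, or into the discrete state $\mathbf{p}'\mathbf{l}'$, as it may also
be labelled, with the atom jumping from state $\alpha^0$ to state $\alpha'$, is

\[ \langle\mathbf{p}'\mathbf{l}'D\,\alpha'|H_Q|\alpha^0\rangle = \frac{e h^{1/2}}{4\pi^2 m\,\nu'^{1/2}}\langle\alpha'|p_{\mathbf{l}'}|\alpha^0\rangle\,s_{\mathbf{k}'}^{-1/2} = \frac{e}{m h(2\pi\nu')^{1/2}}\langle\alpha'|p_{\mathbf{l}'}|\alpha^0\rangle\,s_{\mathbf{p}'}^{-1/2}, \]

since $s_\mathbf{k} = s_\mathbf{p}\hbar^3$. The $p_{\mathbf{l}'}$ occurring here, referring to the momentum
of the electron, is, of course, quite distinct from the other letters $\mathbf{p}$,
referring to the momentum of the emitted photon. To avoid confusion
we shall replace the electron momentum $\mathbf{p}$ by $m\dot{\mathbf{x}}$, these two
dynamical variables being the same for the unperturbed atom. Passing
over to continuous photon states by means of the conjugate
imaginary of equation (56), we get

\[ \langle\mathbf{p}'\mathbf{l}'\alpha'|H_Q|\alpha^0\rangle = \frac{e}{h(2\pi\nu')^{1/2}}\langle\alpha'|\dot x_{\mathbf{l}'}|\alpha^0\rangle. \tag{75} \]

Similarly, the matrix element of $H_Q$ referring to the absorption of a
photon from the continuous state $\mathbf{p}^0\mathbf{l}^0$ with the atom jumping from
state $\alpha^0$ to state $\alpha'$ is

\[ \langle\alpha'|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha^0\rangle = \frac{e}{h(2\pi\nu^0)^{1/2}}\langle\alpha'|\dot x_{\mathbf{l}^0}|\alpha^0\rangle, \tag{76} \]

and the matrix element referring to the scattering of a photon from
the continuous state $\mathbf{p}^0\mathbf{l}^0$ to the continuous state $\mathbf{p}'\mathbf{l}'$ with the atom
jumping from state $\alpha^0$ to state $\alpha'$ is

\[ \langle\mathbf{p}'\mathbf{l}'\alpha'|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha^0\rangle = \frac{e^2}{2\pi m h^2\,\nu'^{1/2}(\nu^0)^{1/2}}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}, \tag{77} \]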


there being two terms in (74) which contribute to it. These matrix 
elements will be used in the next section. The matrix elements 
referring to the simultaneous absorption or emission of two photons 
may be written down in the same way, but they lead to physical 
effects too small to be of practical importance. 


64. Emission, absorption, and scattering of radiation 

We can now determine directly the coefficients of emission, absorption,
and scattering of radiation by substituting in the formulas of
Chapter VIII the values for the matrix elements given by (75), (76),
and (77).

For determining the emission probability we can use formula
(56) of § 53. This shows that for an atom in a state $\alpha^0$ the probability
per unit time per unit solid angle of its spontaneously emitting
a photon and dropping to a state $\alpha'$ of lower energy is

\[ \frac{4\pi^2}{h}\frac{WP}{c^2}\left|\frac{e}{h(2\pi\nu)^{1/2}}\langle\alpha'|\dot x_\mathbf{l}|\alpha^0\rangle\right|^2. \tag{78} \]

Now the energy and momentum of a photon of frequency $\nu$ are

\[ W = h\nu, \qquad P = h\nu/c. \]

Again, from the Heisenberg law (20) of § 29,

\[ \langle\alpha'|\dot x_\mathbf{l}|\alpha^0\rangle = -2\pi i\,\nu(\alpha^0\alpha')\,\langle\alpha'|x_\mathbf{l}|\alpha^0\rangle, \]

$\nu(\alpha^0\alpha')$ being the frequency connected with transitions from state $\alpha^0$
to state $\alpha'$, which in the present case is just the frequency $\nu$ of the
emitted radiation. These results substituted in (78) make the emission
coefficient reduce to

\[ \frac{(2\pi\nu)^3}{hc^3}\,|\langle\alpha'|e x_\mathbf{l}|\alpha^0\rangle|^2. \tag{79} \]

To obtain the rate of emission of energy per unit solid angle for a
specified polarization, we must multiply this by $h\nu$. This gives for
the total rate of emission of energy in all directions

\[ \frac{4}{3}\frac{(2\pi\nu)^4}{c^3}\,|\langle\alpha'|e\mathbf{x}|\alpha^0\rangle|^2, \tag{80} \]

which is in agreement with expression (34) of § 45 and justifies Heisenberg's
assumption for the interpretation of his matrix elements.
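The passage from the emission coefficient per unit solid angle to the total rate of emission of energy involves a sum over the two polarizations and an integration over all directions, which may be verified numerically. The sketch below is an illustration only: it assumes, for simplicity, that the matrix element of $e\mathbf{x}$ is a real vector along a fixed axis, so that the two polarization components squared add up to the squared length times $\sin^2\theta$; the function names and CGS constants are also assumptions.

```python
import math

H_PLANCK = 6.62607015e-27  # erg s (CGS units are an assumption of this sketch)
C_LIGHT = 2.99792458e10    # cm / s


def emission_coefficient(nu, dipole_l):
    """Photons per unit time per unit solid angle for one polarization.

    dipole_l is the modulus of the matrix element of e*x along that polarization.
    """
    return (2 * math.pi * nu) ** 3 / (H_PLANCK * C_LIGHT ** 3) * dipole_l ** 2


def total_power(nu, dipole):
    """Total rate of emission of energy, (4/3) (2 pi nu)^4 / c^3 * dipole^2."""
    return 4.0 / 3.0 * (2 * math.pi * nu) ** 4 / C_LIGHT ** 3 * dipole ** 2


def total_power_by_integration(nu, dipole, n_theta=2000):
    """Multiply the emission coefficient by h*nu, sum over both polarizations
    and integrate over directions by the midpoint rule in the polar angle."""
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        # Both polarizations together contribute dipole^2 * sin^2(theta).
        per_solid_angle = H_PLANCK * nu * emission_coefficient(nu, dipole * math.sin(theta))
        total += per_solid_angle * 2 * math.pi * math.sin(theta) * d_theta
    return total
```

The angular integration supplies the factor $8\pi/3$, which with the factor $h\nu$ turns the coefficient per unit solid angle into the closed form computed by `total_power`.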

In the same way the absorption coefficient, given by formula
(59) of § 53, becomes for photons

\[ \frac{4\pi^2h^2W}{c^2P}\left|\frac{e}{h(2\pi\nu)^{1/2}}\langle\alpha'|\dot x_\mathbf{l}|\alpha^0\rangle\right|^2 = \frac{8\pi^3\nu}{c}\,|\langle\alpha'|e x_\mathbf{l}|\alpha^0\rangle|^2. \]

This absorption coefficient refers to an incident beam of one photon
crossing unit area per unit time per unit energy range. If we take
one per unit frequency range instead of energy range, as is usual
when dealing with radiation, the absorption coefficient becomes

\[ \frac{8\pi^3\nu}{hc}\,|\langle\alpha'|e x_\mathbf{l}|\alpha^0\rangle|^2. \]

This result is the same as (32) of § 45, if we substitute for the $E_\nu$
there the energy $h\nu$ of a single photon. Thus the elementary theory
of § 45, in which the radiation field is treated as an external perturbation,
gives the correct value for the absorption coefficient.

This agreement between the elementary theory and the present
theory could be inferred from general arguments. The two theories
differ only in that the field quantities all commute with one another
in the elementary theory and satisfy definite commutation relations
in the present theory, and this difference becomes unimportant for
strong fields. Thus the two theories must give the same absorption
and emission when strong fields are concerned. Since both theories
give the rate of absorption proportional to the intensity of the incident
beam, the agreement must hold also for weak fields in the case of
absorption. In the same way the stimulated part of the emission in the 
present theory must agree with the emission in the elementary theory. 

Let us now consider scattering. The direct scattering coefficient is 
given by formula (38) of § 50. Such scattering of photons will not be 
accompanied by any change of state of the atom on account of the 
factor $\delta_{\alpha'\alpha^0}$ in the expression for the matrix element (77). Thus the
final energy $W'$ of the photon will equal its initial energy $W^0$. The
scattering coefficient now reduces to

\[ e^4/m^2c^4 \cdot (\mathbf{l}'\mathbf{l}^0)^2. \]

This is the same as that given by classical mechanics for the scattering
of radiation by a free electron. We thus see that the direct scattering
of radiation by an electron in an atom is independent of the atom
and is correctly given by the classical theory. This result, it should 
be remembered, holds only provided the wavelength of the radiation 
is large compared with the dimensions of the atom. 
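The magnitude of this direct scattering coefficient is fixed by the length $e^2/mc^2$, usually called the classical radius of the electron. The following sketch evaluates it; the function names and the CGS values of the constants are assumptions of the illustration.

```python
E_CHARGE = 4.80320471e-10   # esu (CGS units are an assumption of this sketch)
M_ELECTRON = 9.1093837e-28  # g
C_LIGHT = 2.99792458e10     # cm / s


def dot(a, b):
    """Scalar product of two real 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def direct_scattering_coefficient(l_out, l_in):
    """e^4 / (m^2 c^4) times the squared scalar product of the two unit
    polarization vectors, in cm^2 per unit solid angle."""
    r_e = E_CHARGE ** 2 / (M_ELECTRON * C_LIGHT ** 2)  # about 2.82e-13 cm
    return r_e ** 2 * dot(l_out, l_in) ** 2
```

For parallel polarizations the coefficient is about $7.9\times10^{-26}$ cm² per unit solid angle, and for perpendicular polarizations it vanishes.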

The direct scattering is a mathematical concept and cannot be 
separated out experimentally from the total scattering, given by 
formula (44) of § 51. Let us see what this total scattering is in the 
case of photons. We must be careful in our application of formula 
(44) of § 51. The summation $\sum_k$ in this formula may be considered as
representing the contribution to the scattering of double transitions
consisting of transitions firstly from the initial state to state $k$ and
secondly from state $k$ to the final state. The first transition may be
an absorption of the incident photon and the second an emission of
the required scattered photon, but it is also possible for the first
transition to be the emission and the second the absorption. It is
clear from the general nature of the method used for deriving formula
(44) of § 51 that both these kinds of double transitions must be included
in the summation $\sum_k$ when this formula is applied to photons,
although only the first of them appears in the actual derivation given
in § 51, as the possibility of the particle being created or annihilated 
was not taken into account there. 

We use zero, single prime, and double prime to refer to the initial, 
final, and intermediate states of the atom respectively, and zero and 
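single prime to refer to the absorbed and emitted photons respectively.
Then, for the double transition of absorption followed by
emission, we must take for the matrix elements

\[ \langle k|V|\mathbf{p}^0\alpha^0\rangle, \qquad \langle\mathbf{p}'\alpha'|V|k\rangle \]

of the formula (44) of § 51

\[ \langle k|V|\mathbf{p}^0\alpha^0\rangle = \langle\alpha''|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha^0\rangle, \qquad \langle\mathbf{p}'\alpha'|V|k\rangle = \langle\mathbf{p}'\mathbf{l}'\alpha'|H_Q|\alpha''\rangle. \]

Also

\[ E' - E_k = h\nu^0 + H_P(\alpha^0) - H_P(\alpha'') = h[\nu^0 - \nu(\alpha''\alpha^0)], \]

where

\[ h\nu(\alpha''\alpha^0) = H_P(\alpha'') - H_P(\alpha^0). \]

Similarly, for the double transition of emission followed by absorption
we must take

\[ \langle k|V|\mathbf{p}^0\alpha^0\rangle = \langle\mathbf{p}'\mathbf{l}'\alpha''|H_Q|\alpha^0\rangle, \qquad \langle\mathbf{p}'\alpha'|V|k\rangle = \langle\alpha'|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha''\rangle \]

and

\[ E' - E_k = h\nu^0 + H_P(\alpha^0) - H_P(\alpha'') - h\nu^0 - h\nu' = -h[\nu' + \nu(\alpha''\alpha^0)], \]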

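there being now two photons, of frequencies $\nu^0$ and $\nu'$, in existence
for the intermediate state. Substituting in (44) of § 51 the values of
the matrix elements given by (75), (76), and (77), we get for the
scattering coefficient

\[ \frac{e^4}{h^2c^4}\frac{\nu'}{\nu^0}\left|\frac{h}{m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0} + \sum_{\alpha''}\left\{\frac{\langle\alpha'|\dot x_{\mathbf{l}'}|\alpha''\rangle\langle\alpha''|\dot x_{\mathbf{l}^0}|\alpha^0\rangle}{\nu^0-\nu(\alpha''\alpha^0)} - \frac{\langle\alpha'|\dot x_{\mathbf{l}^0}|\alpha''\rangle\langle\alpha''|\dot x_{\mathbf{l}'}|\alpha^0\rangle}{\nu'+\nu(\alpha''\alpha^0)}\right\}\right|^2. \tag{81} \]

If we write (81) in terms of $x$ instead of $\dot x$, we get

\[ \frac{(2\pi e)^4}{h^2c^4}\frac{\nu'}{\nu^0}\left|\frac{\hbar}{2\pi m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0} - \sum_{\alpha''}\nu(\alpha'\alpha'')\,\nu(\alpha''\alpha^0)\left\{\frac{\langle\alpha'|x_{\mathbf{l}'}|\alpha''\rangle\langle\alpha''|x_{\mathbf{l}^0}|\alpha^0\rangle}{\nu^0-\nu(\alpha''\alpha^0)} - \frac{\langle\alpha'|x_{\mathbf{l}^0}|\alpha''\rangle\langle\alpha''|x_{\mathbf{l}'}|\alpha^0\rangle}{\nu'+\nu(\alpha''\alpha^0)}\right\}\right|^2. \tag{82} \]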
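We can simplify (82) with the help of the quantum conditions.
We have
$$x_{l'}x_{l^0}-x_{l^0}x_{l'} = 0,$$
which gives

$$\sum_{\alpha''}\{\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle-\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle\} = 0, \tag{83}$$

and also

$$x_{l'}\dot x_{l^0}-\dot x_{l^0}x_{l'} = \frac{i\hbar}{m}(\mathbf{l}'\mathbf{l}^0),$$

which gives

$$\sum_{\alpha''}\{\langle\alpha'|x_{l'}|\alpha''\rangle\,\nu(\alpha''\alpha^0)\langle\alpha''|x_{l^0}|\alpha^0\rangle-\nu(\alpha'\alpha'')\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle\} = \frac{\hbar}{2\pi m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}. \tag{84}$$

Multiplying (83) by $\nu'$ and adding to (84), we obtain

$$\sum_{\alpha''}\{\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle[\nu'+\nu(\alpha''\alpha^0)]-\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle[\nu'+\nu(\alpha'\alpha'')]\} = \frac{\hbar}{2\pi m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}.$$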

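If we substitute this expression for $\hbar/2\pi m\,.\,(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}$ in (82), we
obtain, after a straightforward reduction making use of identical
relations between the $\nu$'s,

$$\frac{(2\pi e)^4}{h^2c^4}\,\nu^0\nu'^3\left|\sum_{\alpha''}\left\{\frac{\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle}{\nu^0-\nu(\alpha''\alpha^0)}-\frac{\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle}{\nu'+\nu(\alpha''\alpha^0)}\right\}\right|^2. \tag{85}$$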

This gives the scattering coefficient in the form of the effective 
area that a photon has to hit per unit solid angle of scattering. It is 
known as the Kramers-Heisenberg dispersion formula, having been first 
obtained by these authors from analogies with the classical theory 
of dispersion. 
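
The reduction from (82) to (85) can be sketched as follows. This is only a sketch under stated assumptions: it takes the expression inside the modulus of (82) to be $\hbar/2\pi m\,.\,(\mathbf{l}'\mathbf{l}^0)\delta_{\alpha'\alpha^0}$ minus the sum over $\alpha''$, it uses conservation of energy for the whole process in the form $\nu' = \nu^0-\nu(\alpha'\alpha^0)$, and it uses the additivity $\nu(\alpha'\alpha'')+\nu(\alpha''\alpha^0) = \nu(\alpha'\alpha^0)$. The shorthand $X$ and $Y$ is introduced here for brevity and is not the book's notation.

```latex
% Shorthand for this sketch only:
%   X = <a'|x_{l'}|a''><a''|x_{l^0}|a^0>,   Y = <a'|x_{l^0}|a''><a''|x_{l'}|a^0>
\begin{align*}
\nu'+\nu(\alpha''\alpha^0) &= \nu^0-\nu(\alpha'\alpha''), \\[4pt]
X\left\{[\nu'+\nu(\alpha''\alpha^0)]
   -\frac{\nu(\alpha'\alpha'')\,\nu(\alpha''\alpha^0)}{\nu^0-\nu(\alpha''\alpha^0)}\right\}
 &= X\,\frac{[\nu^0-\nu(\alpha'\alpha'')][\nu^0-\nu(\alpha''\alpha^0)]
      -\nu(\alpha'\alpha'')\,\nu(\alpha''\alpha^0)}{\nu^0-\nu(\alpha''\alpha^0)}
  = \frac{\nu^0\nu'\,X}{\nu^0-\nu(\alpha''\alpha^0)}, \\[4pt]
-Y\left\{[\nu'+\nu(\alpha'\alpha'')]
   -\frac{\nu(\alpha'\alpha'')\,\nu(\alpha''\alpha^0)}{\nu'+\nu(\alpha''\alpha^0)}\right\}
 &= -Y\,\frac{\nu'[\nu'+\nu(\alpha'\alpha^0)]}{\nu'+\nu(\alpha''\alpha^0)}
  = -\frac{\nu^0\nu'\,Y}{\nu'+\nu(\alpha''\alpha^0)}.
\end{align*}
```

Each term of the sum thus acquires the common factor $\nu^0\nu'$. On taking the squared modulus this factor becomes $(\nu^0\nu')^2$, which combines with the factor $\nu'/\nu^0$ in front of (82) to give the $\nu^0\nu'^3$ of (85).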

The fact that the various terms in (82) can be combined to give
the result (85) justifies the assumption made in deriving formula (44)
of § 51, that the matrix elements $\langle p'\alpha'|V|p''\alpha''\rangle$ of the interaction
energy are of the second order of smallness compared with the
$\langle p'\alpha'|V|k\rangle$ ones, at any rate when the scattered particles are photons.

65. An assembly of fermions 

An assembly of fermions can be treated by a method similar to 
that used in §§ 59 and 60 for bosons. With the kets (1) we may use 
the antisymmetrizing operator $A$ defined by

$$A = u'!^{-1/2}\sum \pm P, \tag{2'}$$

summed over all permutations $P$, the + or − sign being taken
according to whether $P$ is even or odd. Applied to the ket (1) it gives

$$u'!^{-1/2}\sum \pm P\,|\alpha_1^a\alpha_2^b\alpha_3^c\ldots\alpha_{u'}^g\rangle = A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle, \tag{3'}$$

a ket corresponding to a state for an assembly of $u'$ fermions. The
ket (3') is normalized provided the individual fermion kets $|\alpha^a\rangle$, $|\alpha^b\rangle$,...
are all different, otherwise it is zero. In this respect the ket (3') is
simpler than the ket (3). However, (3') is more complicated than (3)
in that (3') depends on the order in which $\alpha^a$, $\alpha^b$, $\alpha^c$,... occur in it,
being subject to a change of sign if an odd permutation is applied
to this order.

We can, as before, introduce the numbers $n_1$, $n_2$, $n_3$,... of fermions
in the states $\alpha^{(1)}$, $\alpha^{(2)}$, $\alpha^{(3)}$,... and treat them as dynamical variables or
observables. They each have as eigenvalues only 0 and 1. They form
a complete set of commuting observables for the assembly of fermions.
The basic kets of a representation with the $n$'s diagonal may be taken
to be connected with the kets (3') by the equation

$$A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = \pm|n_1'n_2'n_3'\ldots\rangle, \tag{6'}$$

corresponding to (6), the $n'$'s being connected with the variables
$\alpha^a$, $\alpha^b$, $\alpha^c$,... by equation (4). The ± sign is needed in (6') since, for
given $n'$'s, the occupied states $\alpha^a$, $\alpha^b$, $\alpha^c$,... are fixed but not their
order, so that the sign of the left-hand side of (6') is not fixed. To
set up a rule which determines the sign in (6'), we must arrange all
the states $\alpha$ for a fermion arbitrarily in some standard order. The
$\alpha$'s occurring in the left-hand side of (6') form a certain selection from
all the $\alpha$'s and the standard order for all the $\alpha$'s will give a standard
order for this selection. We now make the rule that the + sign should
occur in (6') if the $\alpha$'s on the left-hand side can be brought into their
standard order by an even permutation and the − sign if an odd
permutation is required. Owing to the complexity of this rule,
the representation with the basic kets $|n_1'n_2'n_3'\ldots\rangle$ is not a very
useful one.
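
The sign rule for (6') can be illustrated with a short sketch. The function names `ordering_sign` and `occupation_numbers`, and the use of strings as state labels, are assumptions made for this illustration only. The parity is found by counting inversions relative to the chosen standard order, which is one of several equivalent ways to obtain the parity of a permutation.

```python
def ordering_sign(labels, standard_order):
    """Sign in (6'): +1 if an even permutation brings `labels` into the
    standard order, -1 if an odd one is needed, 0 if a state is repeated
    (the antisymmetrized ket then vanishes)."""
    if len(set(labels)) != len(labels):
        return 0
    rank = {state: i for i, state in enumerate(standard_order)}
    ranks = [rank[state] for state in labels]
    # Each pair that is out of order costs one transposition of neighbours.
    inversions = sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )
    return -1 if inversions % 2 else 1


def occupation_numbers(labels, standard_order):
    """The n' values (each 0 or 1) attached to the occupied states."""
    occupied = set(labels)
    return [1 if state in occupied else 0 for state in standard_order]


standard = ["a1", "a2", "a3", "a4"]
sign_even = ordering_sign(["a3", "a1", "a2"], standard)   # cyclic, hence even
sign_odd = ordering_sign(["a2", "a1", "a3"], standard)    # one transposition
numbers = occupation_numbers(["a3", "a1"], standard)
```

Both orderings `["a3", "a1", "a2"]` and `["a2", "a1", "a3"]` give the same occupation numbers, which is why the occupation numbers alone cannot fix the sign of the ket.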

If the number of fermions in the assembly is variable, we can set
up the complete set of kets

$$|\rangle,\quad |\alpha^a\rangle,\quad A|\alpha^a\alpha^b\rangle,\quad A|\alpha^a\alpha^b\alpha^c\rangle,\quad \ldots, \tag{9'}$$

corresponding to (9). A general ket is now expressible as a sum of
the various kets (9').

To continue with the development we introduce a set of linear
operators $\eta$, $\bar\eta$, one pair $\eta_a$, $\bar\eta_a$ corresponding to each fermion state $\alpha^a$,
satisfying the commutation relations

$$\left.\begin{aligned}\eta_a\eta_b+\eta_b\eta_a &= 0,\\ \bar\eta_a\bar\eta_b+\bar\eta_b\bar\eta_a &= 0,\\ \bar\eta_a\eta_b+\eta_b\bar\eta_a &= \delta_{ab}.\end{aligned}\right\} \tag{11'}$$

These relations are like (11) with a + sign instead of a − on the left-hand
side. They show that, for $a \neq b$, $\eta_a$ and $\bar\eta_a$ anticommute with
$\eta_b$ and $\bar\eta_b$, while, putting $b = a$, they give

$$\eta_a^2 = 0, \qquad \bar\eta_a^2 = 0, \qquad \bar\eta_a\eta_a+\eta_a\bar\eta_a = 1. \tag{11''}$$

To verify that the relations (11') are consistent, we note that linear
operators $\eta$, $\bar\eta$ satisfying the conditions (11') can be constructed in
the following way. For each state $\alpha^a$ we take a set of linear operators
$\sigma_{xa}$, $\sigma_{ya}$, $\sigma_{za}$ like the $\sigma_x$, $\sigma_y$, $\sigma_z$ introduced in § 37 to describe the spin
of an electron and such that $\sigma_{xa}$, $\sigma_{ya}$, $\sigma_{za}$ commute with $\sigma_{xb}$, $\sigma_{yb}$, $\sigma_{zb}$
for $b \neq a$. We also take an independent set of linear operators $\zeta_a$,
one for each state $\alpha^a$, which all anticommute with one another and
have their squares unity, and commute with all the $\sigma$ variables.
Then, putting

$$\eta_a = \tfrac12\zeta_a(\sigma_{xa}-i\sigma_{ya}), \qquad \bar\eta_a = \tfrac12\zeta_a(\sigma_{xa}+i\sigma_{ya}),$$

we have all the conditions (11') satisfied.
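
The construction can be checked numerically for two fermion states. The sketch below is an illustration under assumptions, not part of the text: it realizes the two sets of $\sigma$ variables on two two-dimensional factors, and it realizes the two $\zeta$'s as the Pauli matrices $\sigma_x$ and $\sigma_y$ acting on a third, auxiliary two-dimensional factor (one possible choice of anticommuting operators with unit squares). The sign convention $\eta_a = \tfrac12\zeta_a(\sigma_{xa}-i\sigma_{ya})$ is also assumed. All function and variable names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker (direct) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def embed(m1, m2, aux):
    """Operator m1 on state 1, m2 on state 2, aux on the auxiliary factor."""
    return kron(kron(m1, m2), aux)

sigma_x = [embed(SX, I2, I2), embed(I2, SX, I2)]
sigma_y = [embed(SY, I2, I2), embed(I2, SY, I2)]
zeta = [embed(I2, I2, SX), embed(I2, I2, SY)]  # anticommute, squares unity

eta = [scale(0.5, matmul(zeta[a], add(sigma_x[a], scale(-1j, sigma_y[a]))))
       for a in range(2)]
eta_bar = [scale(0.5, matmul(zeta[a], add(sigma_x[a], scale(1j, sigma_y[a]))))
           for a in range(2)]

DIM = 8
ZERO = [[0] * DIM for _ in range(DIM)]
UNIT = [[1 if i == j else 0 for j in range(DIM)] for i in range(DIM)]

def relations_hold():
    """Check the three anticommutation relations for all pairs a, b."""
    for a in range(2):
        for b in range(2):
            delta = UNIT if a == b else ZERO
            if not (close(anticommutator(eta[a], eta[b]), ZERO)
                    and close(anticommutator(eta_bar[a], eta_bar[b]), ZERO)
                    and close(anticommutator(eta_bar[a], eta[b]), delta)):
                return False
    return True

# eta_a * eta_bar_a should be idempotent, i.e. have eigenvalues 0 and 1 only.
n_first = matmul(eta[0], eta_bar[0])
```

The auxiliary factor is what makes operators belonging to different states anticommute rather than commute; without the $\zeta$'s the two sets of $\sigma$ variables would give commuting operators.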

From (11'')

$$(\eta_a\bar\eta_a)^2 = \eta_a\bar\eta_a\eta_a\bar\eta_a = \eta_a(1-\eta_a\bar\eta_a)\bar\eta_a = \eta_a\bar\eta_a.$$

This is an algebraic equation for $\eta_a\bar\eta_a$, showing that $\eta_a\bar\eta_a$ is an
observable with the eigenvalues 0 and 1. Also $\eta_a\bar\eta_a$ commutes with
$\eta_b\bar\eta_b$ for $b \neq a$. These results allow us to put

$$\eta_a\bar\eta_a = n_a, \tag{12'}$$

the same as (12). From (11'') we get now

$$\bar\eta_a\eta_a = 1-n_a, \tag{13'}$$

the equation corresponding to (13).

Let us write the normalized ket which is an eigenket of all the $n$'s
belonging to the eigenvalues zero as $\rangle_A$. Then

$$n_a\rangle_A = 0,$$

so from (12')

$$\langle_A\,\eta_a\bar\eta_a\rangle_A = 0.$$

Hence

$$\bar\eta_a\rangle_A = 0, \tag{15'}$$

like (15). Again

$$\langle_A\,\bar\eta_a\eta_a\rangle_A = \langle_A\,(1-n_a)\rangle_A = \langle_A\,\rangle_A = 1,$$

showing that $\eta_a\rangle_A$ is normalized, and

$$n_a\eta_a\rangle_A = \eta_a\bar\eta_a\eta_a\rangle_A = \eta_a(1-n_a)\rangle_A = \eta_a\rangle_A,$$

showing that $\eta_a\rangle_A$ is an eigenket of $n_a$ belonging to the eigenvalue
unity. It is an eigenket of the other $n$'s belonging to the eigenvalues
zero, since the other $n$'s commute with $\eta_a$. By generalizing the
argument we see that $\eta_a\eta_b\eta_c\ldots\eta_g\rangle_A$ is normalized and is a simultaneous
eigenket of all the $n$'s, belonging to the eigenvalues unity
for $n_a$, $n_b$, $n_c$,..., $n_g$ and zero for the other $n$'s. This enables us to put

$$A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = \eta_a\eta_b\eta_c\ldots\eta_g\rangle_A, \tag{17'}$$

both sides being antisymmetrical in the labels $a$, $b$, $c$,..., $g$. We have
here the analogue of (17).

If we pass over to a different set of basic kets $|\beta^A\rangle$ for a fermion,
we can introduce a new set of linear operators $\eta_A$ corresponding to
them. We then find, by the same argument as in the case of bosons,
that the new $\eta$'s are connected with the original ones by (21). This
shows that there is a procedure of second quantization for fermions,
similar to that for bosons, with the only difference that the commutation
relations (11') must be employed for fermions to replace the
commutation relations (11) for bosons.

A symmetrical linear operator $U_T$ of the form (22) can be expressed
in terms of the $\eta$, $\bar\eta$ variables by a similar method to that used for
bosons. Equation (24) still holds, and so does (25) with $S$ replaced
by $A$. Instead of (26) we now have

$$\begin{aligned} U_T\,\eta_{x_1}\eta_{x_2}\ldots\rangle_A &= \sum_a\sum_r(-)^{r-1}\eta_a\,\eta_{x_r}^{-1}\eta_{x_1}\eta_{x_2}\ldots\rangle_A\,\langle a|U|x_r\rangle\\ &= \sum_{ab}\eta_a\sum_r(-)^{r-1}\eta_{x_r}^{-1}\eta_{x_1}\eta_{x_2}\ldots\rangle_A\,\delta_{bx_r}\langle a|U|b\rangle, \end{aligned} \tag{26'}$$

$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be cancelled out, without its
position among the other $\eta_x$'s being changed before the cancellation.
Instead of (27) we have

$$\bar\eta_b\,\eta_{x_1}\eta_{x_2}\ldots\rangle_A = \sum_r(-)^{r-1}\eta_{x_r}^{-1}\eta_{x_1}\eta_{x_2}\ldots\rangle_A\,\delta_{bx_r}, \tag{27'}$$

so (28) holds with $\rangle_A$ for $\rangle_S$ and thus (29) holds unchanged. We have
the same final form (29) for $U_T$ in the fermion case as in the boson
case. Similarly, a symmetrical linear operator $V_T$ of the form (30) can
be expressed as

$$V_T = \tfrac12\sum_{abcd}\eta_a\eta_b\,\langle ab|V|cd\rangle\,\bar\eta_d\bar\eta_c, \tag{35'}$$

the same as one of the ways of writing (35).

The foregoing work shows that there is a deep-seated analogy 
between the theory of fermions and that of bosons, only slight 
changes having to be made in the general equations of the formalism 
when one passes from one to the other.

RELATIVISTIC THEORY OF THE ELECTRON
66. Relativistic treatment of a particle 

The theory we have been building up so far is essentially a non-relativistic
one. We have been working all the time with one particular
Lorentz frame of reference and have set up the theory as an
analogue of the classical non-relativistic dynamics. Let us now try 
to make the theory invariant under Lorentz transformations, so that 
it conforms to the special principle of relativity. 

In the first place we note that the general principle of superposition
of states, as given in Chapter I, is a relativistic principle. It
applies to ‘states’ with the relativistic space-time meaning. Beyond
this, though, the theory does not lend itself very well to relativistic
treatment, owing to the fundamental notion of an ‘observable’ not
fitting in very well with the requirements of relativity. The measurement
of an observable, in the theory we have been dealing with up
to the present, has always consisted in the measurement of some 
dynamical variable at some instant of time in some Lorentz frame 
of reference and there does not seem to be any very natural way of 
generalizing this notion of an observable to make it cease to refer to 
a particular Lorentz frame. In consequence one cannot set up a 
scheme of relativistic quantum mechanics with the same degree of 
generality as the non-relativistic theory. All one can do is to solve 
special problems in a Lorentz-invariant way. This should not be 
regarded as a defect of the quantum theory, since it is in perfect 
analogy with the classical theory. Relativistic classical mechanics 
does not involve any such general scheme as the contact transformation
theory of non-relativistic classical mechanics, but consists in the
solution of comparatively special problems. 

One of the special problems that can be handled relativistically is
that of the motion of a particle in an external field of force. Our non-
relativistic quantum mechanics applied to this problem can be fitted
in with the formalism of relativity by a change of notation. We put
$x_1$, $x_2$, $x_3$ for $x$, $y$, $z$ and $x_0$ for $ct$, so that the time-dependent wave
function in Schrödinger's representation appears as $\psi(x_0x_1x_2x_3)$,
in which the four $x$'s may be treated on the same footing. We
write the momentum components as $p_1$, $p_2$, $p_3$ instead of $p_x$, $p_y$, $p_z$.
They satisfy

$$p_r\psi\rangle = -i\hbar\frac{\partial\psi}{\partial x_r}\Big\rangle \qquad (r = 1, 2, 3). \tag{1}$$

To preserve the symmetry between the four $x$'s we introduce a
corresponding linear operator $p_0$, equal to the energy divided by $c$,
whose effect on $\psi$ is

$$p_0\psi\rangle = i\hbar\frac{\partial\psi}{\partial x_0}\Big\rangle. \tag{2}$$

The difference in sign in (1) and (2) is required by relativity.

We treat $x_0$ and $p_0$ as dynamical variables on the same footing as
the other $x$'s and $p$'s. They provide a new degree of freedom. The
standard ket in (1) and (2) must refer to this new degree of freedom
as well as to the previous ones. The lack of symmetry between the
treatment of $x_0$ and that of the other $x$'s in the non-relativistic theory
may be considered as due to our always using a representation with
$x_0$ diagonal and leaving understood the standard ket for the $(x_0p_0)$
degree of freedom. It would seem that only representations with $x_0$
diagonal are useful in the non-relativistic theory. We may therefore
expect that in a relativistic theory, which treats all the four $x$'s on
the same footing, only representations with the four $x$'s diagonal will
be useful. It then becomes convenient to leave understood the standard
ket for all four degrees of freedom and to write any ket as a
wave function in the four $x$'s.

In the theory of the electron that will be developed here we shall
have to introduce some further degrees of freedom describing an
internal motion of the electron. A ket for the whole system will now
be written as a ket in these further degrees of freedom and a wave
function in the four $x$'s, and will appear as $|x_0x_1x_2x_3\rangle$, or $|x\rangle$ for
brevity, according to the notation explained near the end of § 20.

67. The wave equation for the electron 

Let us consider first the case of the motion of an electron in the 
absence of an electromagnetic field, so that the problem is simply 
that of the free particle, as dealt with in § 30, with the possible 
addition of internal degrees of freedom. The relativistic Hamiltonian 
provided by classical mechanics for this system is given by equation 
(23) of § 30, and leads to the wave equation

$$\{p_0-(m^2c^2+p_1^2+p_2^2+p_3^2)^{1/2}\}\,|x\rangle = 0, \tag{3}$$

where the $p$'s are to be interpreted as operators in accordance with
equations (1) and (2). Equation (3), although it takes into account
the relation between energy and momentum required by relativity,
is yet unsatisfactory from the point of view of relativistic theory,
because it is very unsymmetrical between $p_0$ and the other $p$'s, so
much so that one cannot generalize it in a relativistic way to the
case when there is a field present. We must therefore look for a new
wave equation.

If we multiply the wave equation (3) on the left by the operator
$\{p_0+(m^2c^2+p_1^2+p_2^2+p_3^2)^{1/2}\}$, we obtain the equation

$$\{p_0^2-m^2c^2-p_1^2-p_2^2-p_3^2\}\,|x\rangle = 0, \tag{4}$$

which is of a relativistically invariant form and may therefore more 
conveniently be taken as the basis of a relativistic theory. Equation 
(4) is not completely equivalent to equation (3) since, although every 
solution of (3) is also a solution of (4), the converse is not true. Only 
those solutions of (4) belonging to positive values for $p_0$ are also
solutions of (3). 

The wave equation (4) is not of the form required by the general
laws of the quantum theory on account of its being quadratic in $p_0$.
In § 27 we deduced from quite general arguments that the wave
equation must be linear in the operator $\partial/\partial t$ or $p_0$, like equation (7)
of that section. We therefore seek a wave equation that is linear
in $p_0$ and that is roughly equivalent to (4). In order that this wave
equation shall transform in a simple way under a Lorentz transformation,
we try to arrange that it shall be rational and linear in $p_1$, $p_2$,
and $p_3$ as well as in $p_0$, and thus of the form

$$\{p_0+\alpha_1p_1+\alpha_2p_2+\alpha_3p_3+\beta\}\,|x\rangle = 0, \tag{5}$$

where the $\alpha$'s and $\beta$ are independent of the $p$'s. Since we are considering
the case of no field, all points in space-time must be equivalent,
so that the operator in the wave equation must not involve the $x$'s.
Thus the $\alpha$'s and $\beta$ must also be independent of the $x$'s, so that they
must commute with the $p$'s and the $x$'s. They therefore describe
some new degrees of freedom, belonging to some internal motion in
the electron. We shall see later that they bring in the spin of the
electron. It is these degrees of freedom to which the ket $|x\rangle$ refers.

Multiplying (5) by the operator $\{p_0-\alpha_1p_1-\alpha_2p_2-\alpha_3p_3-\beta\}$ on the
left, we obtain

$$\Big\{p_0^2-\sum_{123}\big[\alpha_1^2p_1^2+(\alpha_1\alpha_2+\alpha_2\alpha_1)p_1p_2+(\alpha_1\beta+\beta\alpha_1)p_1\big]-\beta^2\Big\}\,|x\rangle = 0,$$

where $\sum_{123}$ refers to cyclic permutations of the suffixes 1, 2, 3. This is
the same as (4) if the $\alpha$'s and $\beta$ satisfy the relations

$$\alpha_1^2 = 1, \qquad \alpha_1\alpha_2+\alpha_2\alpha_1 = 0,$$
$$\beta^2 = m^2c^2, \qquad \alpha_1\beta+\beta\alpha_1 = 0,$$

together with the relations obtained from these by permuting the
suffixes 1, 2, 3. If we write

$$\beta = \alpha_m mc,$$

these relations may be summed up in the single one,

$$\alpha_\mu\alpha_\nu+\alpha_\nu\alpha_\mu = 2\delta_{\mu\nu} \qquad (\mu, \nu = 1, 2, 3, \text{ or } m). \tag{6}$$

The four $\alpha$'s all anticommute with one another and the square of
each is unity.

Thus by giving suitable properties to the $\alpha$'s and $\beta$ we can make
the wave equation (5) equivalent to (4), in so far as the motion of
the electron as a whole is concerned. We may now assume (5) is the
correct relativistic wave equation for the motion of an electron in
the absence of a field. This gives rise to one difficulty, however,
owing to the fact that (5), like (4), is not exactly equivalent to (3),
but allows solutions corresponding to negative as well as positive
values of $p_0$. The former do not, of course, correspond to any actually
observable motion of an electron. For the present we shall consider
only the positive-energy solutions and shall leave the discussion of
the negative-energy ones to § 73.

We can easily obtain a representation of the four $\alpha$'s. They have
similar algebraic properties to the $\sigma$'s introduced in § 37, which $\sigma$'s
can be represented by matrices with two rows and columns. So long
as we keep to matrices with two rows and columns we cannot get a
representation of more than three anticommuting quantities, and we
have to go to four rows and columns to get a representation of the
four anticommuting $\alpha$'s. It is convenient first to express the $\alpha$'s in
terms of the $\sigma$'s and also of a second similar set of three anticommuting
variables whose squares are unity, $\rho_1$, $\rho_2$, $\rho_3$ say, that are
independent of and commute with the $\sigma$'s. We may take, amongst
other possibilities,

$$\alpha_1 = \rho_1\sigma_1, \qquad \alpha_2 = \rho_1\sigma_2, \qquad \alpha_3 = \rho_1\sigma_3, \qquad \alpha_m = \rho_3, \tag{7}$$

and the $\alpha$'s will then satisfy all the relations (6), as may easily be
verified. If we now take a representation with $\rho_3$ and $\sigma_3$ diagonal,
we shall get the following scheme of matrices:

$$\sigma_1 = \begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix}0&-i&0&0\\i&0&0&0\\0&0&0&-i\\0&0&i&0\end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&1&0\\0&0&0&-1\end{pmatrix},$$

$$\rho_1 = \begin{pmatrix}0&0&1&0\\0&0&0&1\\1&0&0&0\\0&1&0&0\end{pmatrix}, \qquad \rho_2 = \begin{pmatrix}0&0&-i&0\\0&0&0&-i\\i&0&0&0\\0&i&0&0\end{pmatrix}, \qquad \rho_3 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}.$$


Corresponding to the four rows and columns there are four independent
kets, so that the wave function will have four components.
We saw in § 37 that the spin of the electron requires the wave 
function to have two components. The fact that our present theory 
gives four is due to our wave equation (5) having twice as many 
solutions as it ought to have, half of them corresponding to states 
of negative energy. 
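
The statement that the $\alpha$'s built from (7) satisfy the relations (6) can be checked by direct matrix multiplication. The following sketch assumes the four-row scheme with $\rho_3$ and $\sigma_3$ diagonal, in which the $\sigma$'s act as $1\otimes\sigma$ and the $\rho$'s as $\rho\otimes 1$ on a product of two two-dimensional spaces. The helper names are hypothetical and chosen only for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker (direct) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

I2 = [[1, 0], [0, 1]]
PAULI = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])

# sigma_r acts inside each 2x2 block, rho_r acts between the blocks.
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]

# Equation (7): alpha_r = rho_1 sigma_r (r = 1, 2, 3), alpha_m = rho_3.
alpha = {
    1: matmul(rho[0], sigma[0]),
    2: matmul(rho[0], sigma[1]),
    3: matmul(rho[0], sigma[2]),
    "m": rho[2],
}

def satisfies_relation_6():
    """True if alpha_mu alpha_nu + alpha_nu alpha_mu = 2 delta_mu_nu."""
    for mu in alpha:
        for nu in alpha:
            lhs = add(matmul(alpha[mu], alpha[nu]), matmul(alpha[nu], alpha[mu]))
            diagonal = 2 if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    if lhs[i][j] != (diagonal if i == j else 0):
                        return False
    return True

result = satisfies_relation_6()
```

All entries are integers or Gaussian integers, so the comparison is exact and no numerical tolerance is needed.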

With the help of (7), the wave equation (5) may be written with
three-dimensional vector notation

$$\{p_0+\rho_1(\boldsymbol\sigma, \mathbf{p})+\rho_3mc\}\,|x\rangle = 0. \tag{8}$$

To generalize this equation to the case when there is an electromagnetic
field present, we follow the classical rule of replacing $p_0$ and
$\mathbf{p}$ by $p_0+e/c\,.\,A_0$ and $\mathbf{p}+e/c\,.\,\mathbf{A}$, $A_0$ and $\mathbf{A}$ being the scalar and vector
potentials of the field at the place where the electron is. This gives
us the equation

$$\Big\{p_0+\frac{e}{c}A_0+\rho_1\Big(\boldsymbol\sigma, \mathbf{p}+\frac{e}{c}\mathbf{A}\Big)+\rho_3mc\Big\}\,|x\rangle = 0, \tag{9}$$

which is the fundamental wave equation of the relativistic theory of
the electron. The conjugate imaginary equation is

$$\langle x|\Big\{p_0+\frac{e}{c}A_0+\rho_1\Big(\boldsymbol\sigma, \mathbf{p}+\frac{e}{c}\mathbf{A}\Big)+\rho_3mc\Big\} = 0, \tag{10}$$

in which the operators $p$ operate to the left. An operator of differentiation
operating to the left must be interpreted according to (24) of
§ 22.

68. Invariance under a Lorentz transformation

Before proceeding to discuss the physical consequences of the wave 
equation (9) or (10), we shall first verify that our theory really is 
invariant under a Lorentz transformation, or, stated more accurately, 
that the physical results the theory leads to are independent of the 
Lorentz frame of reference used. This is not by any means obvious 
from the form of the wave equation (9). We have to verify that, if 
we write down the wave equation in a different Lorentz frame, the 
solutions of the new wave equation may be put into one-one correspondence
with those of the original one in such a way that corresponding
solutions may be assumed to represent the same state. For
either Lorentz frame, the square of the length of the ket $|x\rangle$ should
give the probability per unit volume of the electron being at the place
$x$ in that Lorentz frame. We may call this the probability density. Its
values, calculated in different Lorentz frames for wave functions
representing the same state, should be connected like the time components
in these frames of some 4-vector. Further, the 4-dimensional
divergence of this 4-vector should vanish, signifying conservation of 
the electron, or that the electron cannot appear or disappear in any 
volume without passing through the boundary. 

For discussing Lorentz transformations it is convenient to make 
the convention that terms containing a repeated suffix are to be 
summed over the values 0, 1, 2, 3 for that suffix. This enables us to 
write equation (9) in the form

$$\Big\{\alpha_\mu\Big(p_\mu+\frac{e}{c}A_\mu\Big)+\alpha_m mc\Big\}\,|x\rangle = 0, \tag{11}$$

$\alpha_0$ being equal to unity, and similarly we can write equation (10) in
the form

$$\langle x|\Big\{\alpha_\mu\Big(p_\mu+\frac{e}{c}A_\mu\Big)+\alpha_m mc\Big\} = 0. \tag{12}$$

We now apply a Lorentz transformation and denote quantities
referring to the new frame by a star. The components of the 4-vectors
$p$ and $A$ will transform according to a linear law of the type

$$p_\mu = a_{\mu\nu}p_\nu^*, \qquad A_\mu = a_{\mu\nu}A_\nu^*. \tag{13}$$

Substituting these expressions for $p_\mu$ and $A_\mu$ in equations (11) and
(12), we obtain

$$\left.\begin{aligned}\Big\{\alpha_\mu a_{\mu\nu}\Big(p_\nu^*+\frac{e}{c}A_\nu^*\Big)+\alpha_m mc\Big\}\,|x\rangle &= 0\\ \text{and}\qquad \langle x|\Big\{\alpha_\mu a_{\mu\nu}\Big(p_\nu^*+\frac{e}{c}A_\nu^*\Big)+\alpha_m mc\Big\} &= 0.\end{aligned}\right\} \tag{14}$$

We now try to bring these equations back to the form of the original
(11) and (12) by making a transformation

$$|x^*\rangle = \gamma|x\rangle, \tag{15}$$

where $\gamma$ is a linear operator in the internal degrees of freedom and is
independent of the $x$'s and $p$'s. The conjugate imaginary equation
to (15) is

$$\langle x^*| = \langle x|\bar\gamma. \tag{16}$$

Equations (14) will go over into the equations

$$\left.\begin{aligned}\bar\gamma\Big\{\alpha_\nu\Big(p_\nu^*+\frac{e}{c}A_\nu^*\Big)+\alpha_m mc\Big\}\,|x^*\rangle &= 0\\ \text{and}\qquad \langle x^*|\Big\{\alpha_\nu\Big(p_\nu^*+\frac{e}{c}A_\nu^*\Big)+\alpha_m mc\Big\}\gamma &= 0\end{aligned}\right\} \tag{17}$$

provided we can choose $\gamma$ such that

$$\bar\gamma\alpha_\nu\gamma = \alpha_\mu a_{\mu\nu}, \qquad \bar\gamma\alpha_m\gamma = \alpha_m. \tag{18}$$

These equations (17) are of the same form as (11) and (12), as required,
since one can divide out by the extra factors $\bar\gamma$ and $\gamma$. The
transformation given by (15), (16), and (18) is something like a
unitary transformation, but is more general since $\gamma$ does not satisfy
the unitary condition.

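In order to verify that we can choose $\gamma$ to satisfy the equations
(18), let us first take the special case when the change of our frame
of reference consists simply of a rotation through a hyperbolic angle
$\theta$ in the $x_0x_1$-plane, so that the transformation equations for the
components of a 4-vector are of the type

$$\left.\begin{aligned} p_0 &= p_0^*\cosh\theta+p_1^*\sinh\theta,\\ p_1 &= p_0^*\sinh\theta+p_1^*\cosh\theta,\\ p_2 &= p_2^*, \qquad p_3 = p_3^*.\end{aligned}\right\} \tag{19}$$

The values of the $a_{\mu\nu}$ may be written down at once from a comparison
of these equations with (13). With these values for the $a_{\mu\nu}$ it is easy
to see that equations (18) hold when we take

$$\gamma = e^{\frac12\theta\alpha_1} = \bar\gamma. \tag{20}$$

We have, in fact,

$$\bar\gamma\alpha_0\gamma = \bar\gamma\gamma = e^{\theta\alpha_1} = 1+\theta\alpha_1+\theta^2\alpha_1^2/2!+\theta^3\alpha_1^3/3!+\ldots.$$

On account of $\alpha_1^2 = 1$, this reduces to

$$\begin{aligned}\bar\gamma\alpha_0\gamma &= \{1+\theta^2/2!+\ldots\}+\alpha_1\{\theta+\theta^3/3!+\ldots\}\\ &= \cosh\theta+\alpha_1\sinh\theta\\ &= \alpha_0\cosh\theta+\alpha_1\sinh\theta.\end{aligned}$$

Again,
$$\bar\gamma\alpha_1\gamma = \alpha_1\bar\gamma\gamma = \alpha_0\sinh\theta+\alpha_1\cosh\theta.$$

Further,
$$\bar\gamma\alpha_2\gamma = e^{\frac12\theta\alpha_1}\alpha_2e^{\frac12\theta\alpha_1} = \alpha_2e^{-\frac12\theta\alpha_1}e^{\frac12\theta\alpha_1} = \alpha_2,$$

since $\alpha_2$ anticommutes with $\alpha_1$, which results in $\alpha_2f(\alpha_1) = f(-\alpha_1)\alpha_2$
for any function $f(\alpha_1)$ of $\alpha_1$. Similarly,

$$\bar\gamma\alpha_3\gamma = \alpha_3, \qquad \bar\gamma\alpha_m\gamma = \alpha_m.$$

Thus the five equations (18) hold with $\gamma$ given by (20) when the $a_{\mu\nu}$
are given by (19).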

As a second typical change of the frame of reference, we may consider
a rotation through an angle $\theta$ in ordinary space about the $x_1$-axis.
The transformation equations are now

$$p_0 = p_0^*, \qquad p_1 = p_1^*,$$
$$p_2 = p_2^*\cos\theta+p_3^*\sin\theta,$$
$$p_3 = -p_2^*\sin\theta+p_3^*\cos\theta.$$

With the new values for the $a_{\mu\nu}$ we can easily verify that equations
(18) hold with

$$\gamma = e^{-\frac12\theta\alpha_2\alpha_3} = \bar\gamma^{-1},$$

the analysis being very similar to the preceding case.

If two changes of the frame of reference are made consecutively,
we simply have to multiply the corresponding $\gamma$'s to get the $\gamma$ for
the resultant change. Now any change of the frame of reference may
be built up from two rotations of the types we have considered, and
hence there will always be a $\gamma$ satisfying (18).
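
The two typical cases can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the conditions (18) in the form $\bar\gamma\alpha_\nu\gamma = \alpha_\mu a_{\mu\nu}$, uses the closed forms $e^{\frac12\theta\alpha_1} = \cosh\tfrac12\theta+\alpha_1\sinh\tfrac12\theta$ (valid because $\alpha_1^2 = 1$) and $e^{-\frac12\theta\alpha_2\alpha_3} = \cos\tfrac12\theta-\alpha_2\alpha_3\sin\tfrac12\theta$ (valid because $(\alpha_2\alpha_3)^2 = -1$), and fixes the sign of the rotation exponent by reading the $a_{\mu\nu}$ off the rotation equations as written above. The four-row matrices follow the scheme (7); all names are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose, playing the role of the bar in the text."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SX, SY, SZ = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
ONE = kron(I2, I2)            # alpha_0 = 1
A1 = kron(SX, SX)             # rho_1 sigma_1
A2 = kron(SX, SY)             # rho_1 sigma_2
A3 = kron(SX, SZ)             # rho_1 sigma_3
AM = kron(SZ, I2)             # alpha_m = rho_3

def boost_gamma(theta):
    return add(scale(math.cosh(theta / 2), ONE), scale(math.sinh(theta / 2), A1))

def rotation_gamma(theta):
    return add(scale(math.cos(theta / 2), ONE),
               scale(-math.sin(theta / 2), matmul(A2, A3)))

def transformed(gamma, a):
    """gamma-bar * a * gamma."""
    return matmul(matmul(dagger(gamma), a), gamma)

THETA = 0.37
ch, sh = math.cosh(THETA), math.sinh(THETA)
co, si = math.cos(THETA), math.sin(THETA)
g_boost, g_rot = boost_gamma(THETA), rotation_gamma(THETA)

boost_ok = (close(transformed(g_boost, ONE), add(scale(ch, ONE), scale(sh, A1)))
            and close(transformed(g_boost, A1), add(scale(sh, ONE), scale(ch, A1)))
            and all(close(transformed(g_boost, a), a) for a in (A2, A3, AM)))
rotation_ok = (close(transformed(g_rot, A2), add(scale(co, A2), scale(-si, A3)))
               and close(transformed(g_rot, A3), add(scale(si, A2), scale(co, A3)))
               and all(close(transformed(g_rot, a), a) for a in (ONE, A1, AM)))
```

The boost operator is Hermitian but not unitary, while the rotation operator is unitary, which is the sense in which the transformation is more general than a unitary one.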

In this way we see that the solutions of the wave equations in the 
new frame of reference, equations (17), can be put into a natural one- 
one correspondence with those of the original wave equations (11) 
and (12), corresponding solutions being connected by (15) and (16), 
and we may assume that corresponding solutions represent the same 
state. It remains for us to verify that the probability density transforms
like the time component of a 4-vector and that the divergence
of this 4-vector vanishes. 

The probability density is $\langle x|x\rangle = \langle x|\alpha_0|x\rangle$ since $\alpha_0 = 1$. Let us
see how the four quantities $\langle x|\alpha_\mu|x\rangle$, with $\mu$ = 0, 1, 2, 3, transform
under a Lorentz transformation. We have, from (15), (16), and (18),

$$\langle x^*|\alpha_\nu|x^*\rangle = \langle x|\bar\gamma\alpha_\nu\gamma|x\rangle = \langle x|\alpha_\mu a_{\mu\nu}|x\rangle = \langle x|\alpha_\mu|x\rangle a_{\mu\nu}.$$

Comparing this result with (13), we see that the four quantities
$\langle x|\alpha_\mu|x\rangle$ transform like the covariant components of a 4-vector (as
defined in § 74). The contravariant components will be

$$\langle x|x\rangle, \quad -\langle x|\alpha_1|x\rangle, \quad -\langle x|\alpha_2|x\rangle, \quad -\langle x|\alpha_3|x\rangle. \tag{21}$$

This verifies that the probability density $\langle x|x\rangle$ is the time component
of a 4-vector and that the corresponding space components are
$-\langle x|\alpha_r|x\rangle$ (with $r$ = 1, 2, 3). These space components multiplied by
the factor $c$ give the probability current, or the probability of the
electron crossing unit area per unit time.

The divergence of the 4-vector is

$$\sum_\mu \pm\frac{\partial}{\partial x_\mu}\langle x|\alpha_\mu|x\rangle, \tag{22}$$

where the ± sign means that the + sign is to be taken for $\mu$ = 0
and the − sign for $\mu$ = 1, 2, 3 before one does the summation. To
prove this divergence vanishes, multiply equation (11) by $\langle x|$ on the
left and (12) by $|x\rangle$ on the right and subtract. The result is

the dots denoting that $p_\mu$ operates to the right on $|x\rangle$ in the first
term and to the left on $\langle x|$ in the second. With the help of (1) and
(2) and the interpretation (24) of § 22 for operators of differentiation
operating to the left, this gives

$$\sum_\mu \pm\frac{\partial}{\partial x_\mu}\langle x|\alpha_\mu|x\rangle = 0,$$


which just expresses the vanishing of (22). In this way we complete 
the proof that our theory gives consistent results in whichever frame 
of reference it is applied. 


69. The motion of a free electron

It is of interest to consider the motion of a free electron in the
above theory according to the Heisenberg picture and to study the
Heisenberg equations of motion. These equations of motion can be
integrated exactly, as was first done by Schrödinger.† For brevity
we shall omit the suffix $t$ which the notation of § 28 requires to be
inserted in dynamical variables that vary with time in the Heisenberg
picture.

As Hamiltonian we must take the expression which we get as equal
to $cp_0$ when we put the operator on $|x\rangle$ in (8) equal to zero, i.e.

$$H = -c\rho_1(\boldsymbol\sigma, \mathbf{p})-\rho_3mc^2 = -c(\boldsymbol\alpha, \mathbf{p})-\rho_3mc^2. \tag{23}$$

† Schrödinger, Sitzungsb. d. Berlin. Akad., 1930, p. 418.

We see at once that the momentum commutes with $H$ and is thus a
constant of the motion. Further, the $x_1$-component of the velocity is

$$\dot x_1 = [x_1, H] = -c\alpha_1. \tag{24}$$

This result is rather surprising, as it means an altogether different 
relation between velocity and momentum from what one has in 
classical mechanics. It is connected, however, with the expressions 
(21) for the probability density and current. The $\dot x_1$ given by (24)
has as eigenvalues $\pm c$, corresponding to the eigenvalues $\pm 1$ of $\alpha_1$.
As $\dot x_2$ and $\dot x_3$ are similar, we can conclude that a measurement of a
component of the velocity of a free electron is certain to lead to the result $\pm c$.
This conclusion is easily seen to hold also when there is a field present.

Since electrons are observed in practice to have velocities considerably
less than that of light, it would seem that we have here a
contradiction with experiment. The contradiction is not real, though, 
since the theoretical velocity in the above conclusion is the velocity 
at one instant of time while observed velocities are always average 
velocities through appreciable time intervals. We shall find upon 
further examination of the equations of motion that the velocity is 
not at all constant, but oscillates rapidly about a mean value which 
agrees with the observed value. 

It may easily be verified that a measurement of a component of the
velocity must lead to the result $\pm c$ in a relativistic theory, simply
from an elementary application of the principle of uncertainty of
§ 24. To measure the velocity we must measure the position at two
slightly different times and then divide the change of position by the
time interval. (It will not do to measure the momentum and apply
a formula, as the ordinary connexion between velocity and momentum
is not valid.) In order that our measured velocity may approximate
to the instantaneous velocity, the time interval between the
two measurements of position must be very short and hence these
measurements must be very accurate. The great accuracy with
which the position of the electron is known during the time-interval
must give rise, according to the principle of uncertainty, to an almost
complete indeterminacy in its momentum. This means that almost
all values of the momentum are equally probable, so that the momentum
is almost certain to be infinite. An infinite value for a component
of momentum corresponds to the value $\pm c$ for the corresponding
component of velocity.

Let us now examine how the velocity of the electron varies with
time. We have
$$i\hbar\dot\alpha_1 = \alpha_1H-H\alpha_1.$$
Now since $\alpha_1$ anticommutes with all the terms in $H$ except $-c\alpha_1p_1$,
$$\alpha_1H+H\alpha_1 = -\alpha_1c\alpha_1p_1-c\alpha_1p_1\alpha_1 = -2cp_1,$$
and hence
$$\left.\begin{aligned} i\hbar\dot\alpha_1 &= 2\alpha_1H+2cp_1\\ &= -2H\alpha_1-2cp_1.\end{aligned}\right\} \tag{25}$$

Since $H$ and $p_1$ are constants, it follows from the first of equations
(25) that
$$i\hbar\ddot\alpha_1 = 2\dot\alpha_1H. \tag{26}$$

This differential equation in $\dot\alpha_1$ can be integrated immediately, the
result being
$$\dot\alpha_1 = \dot\alpha_1^0e^{-2iHt/\hbar}, \tag{27}$$
where $\dot\alpha_1^0$ is a constant, equal to the value of $\dot\alpha_1$ when $t$ = 0. The
factor $e^{-2iHt/\hbar}$ must be put to the right of the factor $\dot\alpha_1^0$ in (27) on
account of the $H$ occurring to the right of the $\dot\alpha_1$ in (26). The second
of equations (25) leads in the same way to the result

$$\dot\alpha_1 = e^{2iHt/\hbar}\dot\alpha_1^0.$$

We can now easily complete the integration of the equation of motion
for $x_1$. From (27) and the first of equations (25)

$$\alpha_1 = \tfrac12i\hbar\dot\alpha_1^0e^{-2iHt/\hbar}H^{-1}-cp_1H^{-1}, \tag{28}$$

and hence the time-integral of equation (24) is

$$x_1 = \tfrac14c\hbar^2\dot\alpha_1^0e^{-2iHt/\hbar}H^{-2}+c^2p_1H^{-1}t+a_1, \tag{29}$$

$a_1$ being a constant.
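
The integrated form (28) can be compared numerically with the direct Heisenberg evolution $e^{iHt/\hbar}\alpha_1e^{-iHt/\hbar}$. The sketch below is an illustration under assumptions: it uses units with $\hbar = c = 1$, takes the momentum along the $x_1$ direction with arbitrarily chosen numerical values `M`, `P`, `T`, and represents $\alpha_1$ and $\rho_3$ by the four-row matrices of § 67. Since $H^2 = p^2+m^2$ times the unit matrix, the exponential has the closed form $e^{-iHs} = \cos Es-iHE^{-1}\sin Es$ with $E = (p^2+m^2)^{1/2}$. All names are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SX, SZ = [[0, 1], [1, 0]], [[1, 0], [0, -1]]
ONE = kron(I2, I2)
A1 = kron(SX, SX)      # alpha_1 = rho_1 sigma_1
RHO3 = kron(SZ, I2)

M, P, T = 1.0, 0.7, 0.3                 # mass, momentum p_1, time (hbar = c = 1)
E = math.sqrt(P * P + M * M)
H = add(scale(-P, A1), scale(-M, RHO3)) # equation (23) with p along x_1
H_INV = scale(1.0 / E ** 2, H)          # because H*H = E**2 times unity

def exp_minus_i_h(s):
    """exp(-i H s) from the closed form valid when H*H is a multiple of unity."""
    return add(scale(math.cos(E * s), ONE), scale(-1j * math.sin(E * s) / E, H))

def alpha1_heisenberg(t):
    """Direct Heisenberg evolution exp(iHt) alpha_1 exp(-iHt)."""
    return matmul(matmul(exp_minus_i_h(-t), A1), exp_minus_i_h(t))

def alpha1_from_28(t):
    """Right-hand side of (28), with alpha_1-dot at t = 0 from (25)."""
    a1_dot0 = scale(-1j, add(matmul(A1, H), scale(-1, matmul(H, A1))))
    oscillatory = scale(0.5j, matmul(matmul(a1_dot0, exp_minus_i_h(2 * t)), H_INV))
    return add(oscillatory, scale(-P, H_INV))

agrees = close(alpha1_heisenberg(T), alpha1_from_28(T))
```

The term `scale(-P, H_INV)` is the constant part of $\alpha_1$; multiplied by $-c$ it gives the constant part $c^2p_1H^{-1}$ of the velocity, while the remaining term oscillates with the high frequency fixed by $2H$.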

From (28) we see that the $x_1$ component of velocity, $-c\alpha_1$, consists
of two parts, a constant part $c^2p_1H^{-1}$, connected with the momentum
by the classical relativistic formula, and an oscillatory part

$$-\tfrac12ic\hbar\dot\alpha_1^0e^{-2iHt/\hbar}H^{-1},$$

whose frequency is high, being $2H/h$, which is at least $2mc^2/h$. Only
the constant part would be observed in a practical measurement of
velocity, such a measurement giving the average velocity through a
time-interval much larger than $h/2mc^2$. The oscillatory part secures
that the instantaneous value of $\dot x_1$ shall have the eigenvalues $\pm c$. The
oscillatory part of $x_1$ is small, being, according to (29),

$$\tfrac14c\hbar^2\dot\alpha_1^0e^{-2iHt/\hbar}H^{-2} = -\tfrac12ic\hbar(\alpha_1+cp_1H^{-1})H^{-1},$$

which is of the order of magnitude $\hbar/mc$, since $(\alpha_1+cp_1H^{-1})$ is of the
order of magnitude unity.

70. Existence of the spin

In § 67 we saw that the correct wave equation for the electron in 
the absence of an electromagnetic field, namely equation (5) or (8), is 
equivalent to the wave equation (4) which is suggested from analogy 
with the classical theory. This equivalence no longer holds when 
there is a field. The wave equation to be expected from analogy with 
the classical theory in this case is 

j^o+^oj — (p+^ A ) — rn 2 c 2 [|a;> = 0, (30) 

in which the operator is just the classical relativistic Hamiltonian. 
If we must multiply (9) by some factor on the left to make it resemble 
(30) as closely as possible, namely the factor 


. e 4 . e A \ 

Po+~ A o~Pi °>P+-A -P 3 mc, 
c \ c 


we get 


Po+l A o) -(®,P+~Aj -mV+^l^o+^ojJa.p+^Aj- 


-(o.p+^Aj^ + ^oj j|*> = 0. (31) 

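We now use the general formula that, if $\mathbf{B}$ and $\mathbf{C}$ are any two
three-dimensional vectors that commute with $\boldsymbol{\sigma}$,

$$(\boldsymbol{\sigma}, \mathbf{B})(\boldsymbol{\sigma}, \mathbf{C}) = \sum_{123}\{\sigma_1^2 B_1 C_1 + \sigma_1\sigma_2 B_1 C_2 + \sigma_2\sigma_1 B_2 C_1\},$$

the summation referring to cyclic permutations of the suffixes 1, 2, 3,
or

$$(\boldsymbol{\sigma}, \mathbf{B})(\boldsymbol{\sigma}, \mathbf{C}) = (\mathbf{B}, \mathbf{C}) + i\sum_{123}\sigma_3(B_1 C_2 - B_2 C_1) = (\mathbf{B}, \mathbf{C}) + i(\boldsymbol{\sigma}, \mathbf{B}\times\mathbf{C}). \tag{32}$$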
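
Formula (32) can be checked numerically for the special case in which
$\mathbf{B}$ and $\mathbf{C}$ are ordinary numerical vectors. The sketch below
assumes the usual two-rowed matrices for the $\sigma$'s; the function names
are hypothetical. It tests only the algebra of the $\sigma$'s, since
numerical vectors commute with one another, which the operators used in
the text need not do.

```python
import numpy as np

# The three sigma matrices in the usual two-rowed representation
# (an assumption of this sketch).
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(v):
    """Return (sigma, v) = sigma_1 v_1 + sigma_2 v_2 + sigma_3 v_3."""
    v = np.asarray(v, dtype=complex)
    return np.einsum("r,rab->ab", v, SIGMA)


def check_formula_32(B, C):
    """True when (sigma,B)(sigma,C) = (B,C) + i(sigma, B x C) holds."""
    B = np.asarray(B, dtype=float)
    C = np.asarray(C, dtype=float)
    lhs = sigma_dot(B) @ sigma_dot(C)
    rhs = np.dot(B, C) * np.eye(2) + 1j * sigma_dot(np.cross(B, C))
    return bool(np.allclose(lhs, rhs))


if __name__ == "__main__":
    print(check_formula_32([1.0, 2.0, 3.0], [-0.5, 0.25, 4.0]))
```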

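Taking $\mathbf{B} = \mathbf{C} = \mathbf{p} + e/c\,.\,\mathbf{A}$, we find, since

$$\Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)\times\Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr) = \frac{e}{c}\{\mathbf{p}\times\mathbf{A} + \mathbf{A}\times\mathbf{p}\} = -i\hbar\,\frac{e}{c}\,\operatorname{curl}\mathbf{A} = -i\hbar\,\frac{e}{c}\,\boldsymbol{\mathcal{H}},$$

where $\boldsymbol{\mathcal{H}}$ is the magnetic field, that

$$\Bigl(\boldsymbol{\sigma}, \mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 = \Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 + \frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}). \tag{33}$$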
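Also we have

$$\begin{aligned}
\Bigl(p_0 + \frac{e}{c}A_0\Bigr)\Bigl(\boldsymbol{\sigma}, \mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr) - \Bigl(\boldsymbol{\sigma}, \mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)\Bigl(p_0 + \frac{e}{c}A_0\Bigr)
&= \frac{e}{c}(\boldsymbol{\sigma}, p_0\mathbf{A} - \mathbf{A}p_0 + A_0\mathbf{p} - \mathbf{p}A_0) \\
&= \frac{i\hbar e}{c}\Bigl(\boldsymbol{\sigma}, \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \operatorname{grad}A_0\Bigr) = -\frac{i\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{E}}),
\end{aligned}$$

where $\boldsymbol{\mathcal{E}}$ is the electric field. Thus (31) becomes

$$\Bigl\{\Bigl(p_0 + \frac{e}{c}A_0\Bigr)^2 - \Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 - m^2c^2 - \frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}) - i\rho_1\frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{E}})\Bigr\}|x\rangle = 0. \tag{34}$$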

This equation differs from (30) through having two extra terms in 
the operator. These extra terms involve some new physical effects, 
but since they are not real they do not lend themselves very directly 
to physical interpretation. 

To get an understanding of the physical features involved in the
difference between (34) and (30) it is better to work with the
Heisenberg picture, this picture being always the more suitable one for
comparisons between classical and quantum mechanics. The
Heisenberg equations of motion are determined by the Hamiltonian

$$H = -eA_0 - c\rho_1\Bigl(\boldsymbol{\sigma}, \mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr) - \rho_3 mc^2, \tag{35}$$


the generalization of (23) to the case when there is a field. Equation 
(35) gives 


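$$\begin{aligned}
\Bigl(\frac{H}{c} + \frac{e}{c}A_0\Bigr)^2 &= \Bigl\{\rho_1\Bigl(\boldsymbol{\sigma}, \mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr) + \rho_3 mc\Bigr\}^2 \\
&= \Bigl(\boldsymbol{\sigma}, \mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 + m^2c^2 \\
&= \Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 + m^2c^2 + \frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}})
\end{aligned} \tag{36}$$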


with the help of (33). We have here the real part of the extra terms 
in (34) appearing without the pure imaginary part. For an electron 
moving slowly (i.e. with small momentum), we may expect the 
Heisenberg equations of motion to be determined by a Hamiltonian 
of the form $mc^2 + H_1$, where $H_1$ is small compared with $mc^2$. Putting
$mc^2 + H_1$ for $H$ in (36) and neglecting $H_1^2$ and other terms involving
$c^{-2}$, we get, on dividing by $2m$,

$$H_1 + eA_0 = \frac{1}{2m}\Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 + \frac{\hbar e}{2mc}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}). \tag{37}$$

The Hamiltonian $H_1$ given by (37) is the same as the classical
Hamiltonian for a slow electron, except for the last term

$$\frac{\hbar e}{2mc}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}).$$

This term may be considered as an additional potential energy
which a slow electron has in the quantum theory and may be
interpreted as arising from the electron having a magnetic moment
$-\hbar e/2mc\,.\,\boldsymbol{\sigma}$. This magnetic moment is the one assumed in §§ 41 and
47 for dealing with the Zeeman effect and is in agreement with
experiment.

The spin angular momentum does not give rise to any potential 
energy and therefore does not appear in the result of the preceding 
calculation. The simplest way of showing the existence of the spin 
angular momentum is to take the case of the motion of a free electron 
or an electron in a central field of force and determine the angular 
momentum integrals. This means working with the Hamiltonian (23), 
or with the Hamiltonian (35) with $\mathbf{A} = 0$ and $A_0$ a function of the
radius $r$, i.e.

$$H = -eA_0(r) - c\rho_1(\boldsymbol{\sigma}, \mathbf{p}) - \rho_3 mc^2, \tag{38}$$


and obtaining the Heisenberg equations of motion for the angular
momentum. With either Hamiltonian we find for the rate of change
of the $x_1$-component of orbital angular momentum, $m_1 = x_2 p_3 - x_3 p_2$,
with the help of commutation relations proved in § 35,

$$\begin{aligned}
i\hbar\,\dot m_1 &= m_1 H - H m_1 \\
&= -c\rho_1\{m_1(\boldsymbol{\sigma}, \mathbf{p}) - (\boldsymbol{\sigma}, \mathbf{p})m_1\} \\
&= -i\hbar c\rho_1\{\sigma_2 p_3 - \sigma_3 p_2\}.
\end{aligned}$$

Thus $\dot m_1 \neq 0$ and the orbital angular momentum is not a constant
of the motion. This result is to be expected from the integrated
equation of motion (29), the oscillatory part of the motion here
displayed giving rise to an oscillatory term in the angular momentum.
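We have further

$$\begin{aligned}
i\hbar\,\dot\sigma_1 &= \sigma_1 H - H\sigma_1 \\
&= -c\rho_1\{\sigma_1(\boldsymbol{\sigma}, \mathbf{p}) - (\boldsymbol{\sigma}, \mathbf{p})\sigma_1\} \\
&= -c\rho_1(\sigma_1\boldsymbol{\sigma} - \boldsymbol{\sigma}\sigma_1, \mathbf{p}) \\
&= -2ic\rho_1\{\sigma_3 p_2 - \sigma_2 p_3\}
\end{aligned}$$

with the help of equations (51) of § 37. Hence

$$\dot m_1 + \tfrac{1}{2}\hbar\,\dot\sigma_1 = 0,$$

so that the vector $\mathbf{m} + \tfrac{1}{2}\hbar\boldsymbol{\sigma}$ is a constant of the motion. This result
one can interpret by saying the electron has a spin angular momentum
$\tfrac{1}{2}\hbar\boldsymbol{\sigma}$, which must be added to the orbital angular momentum $\mathbf{m}$ before
one gets a constant of the motion. The spin angular momentum
could alternatively be obtained from the rotation operators for states
of spin in accordance with the general method of § 35.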
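
The commutator $\sigma_1 H - H\sigma_1$ used above can be verified with
explicit four-rowed matrices, treating the momentum as a numerical
vector. This is a sketch under stated assumptions: the $\sigma$'s and
$\rho$'s are built as Kronecker products of the usual two-rowed matrices
(any representation with the same algebra would serve), and the function
names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# sigma_r and rho_r act on different two-valued indices, so every
# sigma commutes with every rho (assumed representation).
SIGMA = [np.kron(I2, s) for s in PAULI]
RHO = [np.kron(s, I2) for s in PAULI]


def hamiltonian(p, m=1.0, c=1.0):
    """H = -c rho_1 (sigma, p) - rho_3 m c^2 for a numerical momentum p."""
    sigma_p = sum(p[r] * SIGMA[r] for r in range(3))
    return -c * RHO[0] @ sigma_p - m * c**2 * RHO[2]


def sigma1_commutator_check(p, m=1.0, c=1.0):
    """Check sigma_1 H - H sigma_1 = -2ic rho_1 (sigma_3 p_2 - sigma_2 p_3)."""
    H = hamiltonian(p, m, c)
    lhs = SIGMA[0] @ H - H @ SIGMA[0]
    rhs = -2j * c * RHO[0] @ (SIGMA[2] * p[1] - SIGMA[1] * p[2])
    return bool(np.allclose(lhs, rhs))


def energy_squared_check(p, m=1.0, c=1.0):
    """Check H^2 = c^2 p^2 + m^2 c^4 (times the unit matrix)."""
    H = hamiltonian(p, m, c)
    e2 = c**2 * sum(x * x for x in p) + m**2 * c**4
    return bool(np.allclose(H @ H, e2 * np.eye(4)))


if __name__ == "__main__":
    p = [0.3, -1.2, 0.7]
    print(sigma1_commutator_check(p), energy_squared_check(p))
```

The orbital part of the argument cannot be tested in this way, since
$\mathbf{m}$ does not commute with $\mathbf{p}$ and so cannot be replaced by numbers.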

The same vector $\boldsymbol{\sigma}$ fixes the directions of both the spin magnetic
moment and the spin angular momentum. If an electron in a certain
state of spin has a spin angular momentum of $\tfrac{1}{2}\hbar$ in a particular
direction, it will have a magnetic moment $-e\hbar/2mc$ in the same
direction.

71. Transition to polar variables 

For the further study of the motion of an electron in a central field 
of force with the Hamiltonian (38), it is convenient to make a 
transformation to polar coordinates, as was done in § 38 in the 
non-relativistic case. We can introduce $r$ and $p_r$ as before, but
instead of $k$, the magnitude of the orbital angular momentum $\mathbf{m}$,
which is no longer a constant of the motion, we must now use the
magnitude of the total angular momentum $\mathbf{M} = \mathbf{m} + \tfrac{1}{2}\hbar\boldsymbol{\sigma}$. Let us put

$$j^2\hbar^2 = \mathbf{M}^2 + \tfrac{1}{4}\hbar^2. \tag{39}$$

The eigenvalues of $m_3$ are integral multiples of $\hbar$, those of $\tfrac{1}{2}\hbar\sigma_3$ are
$\pm\tfrac{1}{2}\hbar$, and hence those of $M_3$ must be half-odd integral multiples of
$\hbar$. It follows from the theory of § 36 that the eigenvalues of $|j|$ must
be integers greater than zero.

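If in formula (32) we take $\mathbf{B} = \mathbf{C} = \mathbf{m}$, we get

$$\begin{aligned}
(\boldsymbol{\sigma}, \mathbf{m})^2 &= \mathbf{m}^2 + i(\boldsymbol{\sigma}, \mathbf{m}\times\mathbf{m}) \\
&= \mathbf{m}^2 - \hbar(\boldsymbol{\sigma}, \mathbf{m}) \\
&= (\mathbf{m} + \tfrac{1}{2}\hbar\boldsymbol{\sigma})^2 - 2\hbar(\boldsymbol{\sigma}, \mathbf{m}) - \tfrac{3}{4}\hbar^2.
\end{aligned}$$

Hence

$$\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}^2 = \mathbf{M}^2 + \tfrac{1}{4}\hbar^2.$$

Thus $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ is a quantity whose square is $\mathbf{M}^2 + \tfrac{1}{4}\hbar^2$ and we could,
consistently with equation (39), define $j\hbar$ as $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$. This would
not be the most convenient definition for $j$, however, since we would
like to have $j$ a constant of the motion and $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ is not constant.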
We have, in fact, from applications of (32), 

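$$(\boldsymbol{\sigma}, \mathbf{m})(\boldsymbol{\sigma}, \mathbf{p}) = i(\boldsymbol{\sigma}, \mathbf{m}\times\mathbf{p})$$

and

$$(\boldsymbol{\sigma}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{m}) = i(\boldsymbol{\sigma}, \mathbf{p}\times\mathbf{m}),$$

so that

$$\begin{aligned}
(\boldsymbol{\sigma}, \mathbf{m})(\boldsymbol{\sigma}, \mathbf{p}) + (\boldsymbol{\sigma}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{m}) &= i\sum_{123}\sigma_1\{m_2 p_3 - m_3 p_2 + p_2 m_3 - p_3 m_2\} \\
&= i\sum_{123}\sigma_1\,.\,2i\hbar p_1 = -2\hbar(\boldsymbol{\sigma}, \mathbf{p}),
\end{aligned}$$

or

$$\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}(\boldsymbol{\sigma}, \mathbf{p}) + (\boldsymbol{\sigma}, \mathbf{p})\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\} = 0.$$

Thus $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ anticommutes with one of the terms in the expression
(38) for $H$, namely the term $-c\rho_1(\boldsymbol{\sigma}, \mathbf{p})$, and commutes with the other
two. It follows that $\rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}$ commutes with all the three terms
in $H$ and is a constant of the motion. But the square of $\rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}$
is also $\mathbf{M}^2 + \tfrac{1}{4}\hbar^2$. We can therefore take

$$j\hbar = \rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}, \tag{40}$$

which gives us a convenient rational definition for $j$ which is
consistent with (39) and makes $j$ a constant of the motion. The eigenvalues
of this $j$ are all positive and negative integers, excluding zero.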
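
The identity $\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}^2 = \mathbf{M}^2 + \tfrac{1}{4}\hbar^2$ and the
integral eigenvalues can be illustrated in a finite representation. The
sketch below is an assumption-laden example, not part of the general
argument: it takes units with $\hbar = 1$, restricts the orbital angular
momentum to a single integral magnitude `l`, and uses hypothetical
function names. The factor $\rho_3$ of (40), having eigenvalues $\pm 1$,
merely supplies both signs and is left out.

```python
import numpy as np

HBAR = 1.0  # units with hbar = 1 (assumption of this sketch)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def orbital_matrices(l):
    """Matrices of m_1, m_2, m_3 for integral magnitude l, m_3 diagonal."""
    dim = 2 * l + 1
    mvals = np.arange(l, -l - 1, -1, dtype=float)  # l, l-1, ..., -l
    m3 = np.diag(mvals).astype(complex)
    plus = np.zeros((dim, dim), dtype=complex)     # m_1 + i m_2
    for idx in range(1, dim):
        mq = mvals[idx]
        plus[idx - 1, idx] = np.sqrt(l * (l + 1) - mq * (mq + 1))
    minus = plus.conj().T
    m1 = (plus + minus) / 2
    m2 = (plus - minus) / 2j
    return [HBAR * m1, HBAR * m2, HBAR * m3]


def sigma_m_plus_hbar(l):
    """The operator (sigma, m) + hbar on the product space."""
    m = orbital_matrices(l)
    dim = 2 * (2 * l + 1)
    return sum(np.kron(m[r], PAULI[r]) for r in range(3)) + HBAR * np.eye(dim)


def total_M_squared(l):
    """M^2 with M = m + (hbar/2) sigma."""
    m = orbital_matrices(l)
    eye_orb = np.eye(2 * l + 1)
    eye_spin = np.eye(2)
    M = [np.kron(m[r], eye_spin) + 0.5 * HBAR * np.kron(eye_orb, PAULI[r])
         for r in range(3)]
    return sum(Mr @ Mr for Mr in M)


if __name__ == "__main__":
    K = sigma_m_plus_hbar(1)
    print(np.allclose(K @ K, total_M_squared(1) + 0.25 * HBAR**2 * np.eye(6)))
    print(np.round(np.linalg.eigvalsh(K), 10))
```

For `l = 1` the eigenvalues of $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ are $2$ (four times) and $-1$
(twice), non-zero integers as the text requires.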

By a further application of (32), we get 

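$$(\boldsymbol{\sigma}, \mathbf{x})(\boldsymbol{\sigma}, \mathbf{p}) = (\mathbf{x}, \mathbf{p}) + i(\boldsymbol{\sigma}, \mathbf{m}) = rp_r + i\rho_3 j\hbar - i\hbar, \tag{41}$$

with the help of (40) and also of equation (58) of § 38. We introduce
the linear operator $\epsilon$ defined by

$$r\epsilon = \rho_1(\boldsymbol{\sigma}, \mathbf{x}). \tag{42}$$

Since $r$ commutes with $\rho_1$ and with $(\boldsymbol{\sigma}, \mathbf{x})$, it must commute with $\epsilon$.
We thus have

$$r^2\epsilon^2 = [\rho_1(\boldsymbol{\sigma}, \mathbf{x})]^2 = (\boldsymbol{\sigma}, \mathbf{x})^2 = \mathbf{x}^2 = r^2,$$

or $\epsilon^2 = 1$.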

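Now $\rho_1(\boldsymbol{\sigma}, \mathbf{p})$ commutes with $j$, and since there is symmetry between
$\mathbf{x}$ and $\mathbf{p}$ so far as angular momentum is concerned, $\rho_1(\boldsymbol{\sigma}, \mathbf{x})$ must also
commute with $j$. Hence $\epsilon$ commutes with $j$. Further, $\epsilon$ must commute
with $p_r$, since we have

$$(\boldsymbol{\sigma}, \mathbf{x})(\mathbf{x}, \mathbf{p}) - (\mathbf{x}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{x}) = (\boldsymbol{\sigma}, \mathbf{x}(\mathbf{x}, \mathbf{p}) - (\mathbf{x}, \mathbf{p})\mathbf{x}) = i\hbar(\boldsymbol{\sigma}, \mathbf{x}),$$

which gives

$$r\epsilon\,rp_r - rp_r\,r\epsilon = i\hbar r\epsilon,$$

or

$$r^2\epsilon p_r - r^2 p_r\epsilon = 0.$$

From (41) and (42) we obtain

$$r\epsilon\rho_1(\boldsymbol{\sigma}, \mathbf{p}) = rp_r + i\rho_3 j\hbar - i\hbar$$

or

$$\rho_1(\boldsymbol{\sigma}, \mathbf{p}) = \epsilon(p_r - i\hbar/r) + i\epsilon\rho_3 j\hbar/r.$$

Thus (38) becomes

$$H/c = -e/c\,.\,A_0 - \epsilon(p_r - i\hbar/r) - i\epsilon\rho_3 j\hbar/r - \rho_3 mc.$$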

This gives our Hamiltonian expressed in terms of polar variables. It
should be noticed that $\epsilon$ and $\rho_3$ commute with all the other variables
occurring in $H$ and anticommute with one another. This means that
we can take a representation with $\rho_3$ diagonal in which $\epsilon$ and $\rho_3$ are
represented respectively by the matrices

$$\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{43}$$

If $r$ is also diagonal in the representation, the representative
$\langle r'\rho_3'|\rangle$ of a ket will have two components, $\langle r', 1|\rangle = \psi_a(r')$ and
$\langle r', -1|\rangle = \psi_b(r')$ say, referring to the two rows and columns of the
matrices (43).

72. The fine-structure of the energy-levels of hydrogen 

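We shall now take the case of the hydrogen atom, for which $A_0 = e/r$,
and work out its energy-levels, given by the eigenvalues $H'$ of $H$.
The equation $(H' - H)|H'\rangle = 0$ which defines these eigenvalues, when
written in terms of representatives in the representation discussed
above with $\epsilon$ and $\rho_3$ represented by the matrices (43), gives the
equations

$$\begin{aligned}
\Bigl(\frac{H'}{c} + \frac{e^2}{cr}\Bigr)\psi_a - \hbar\Bigl(\frac{\partial}{\partial r} + \frac{j+1}{r}\Bigr)\psi_b + mc\,\psi_a &= 0, \\
\Bigl(\frac{H'}{c} + \frac{e^2}{cr}\Bigr)\psi_b + \hbar\Bigl(\frac{\partial}{\partial r} - \frac{j-1}{r}\Bigr)\psi_a - mc\,\psi_b &= 0.
\end{aligned}$$

If we put

$$\frac{\hbar}{mc + H'/c} = a_1, \qquad \frac{\hbar}{mc - H'/c} = a_2, \tag{44}$$

these equations reduce to

$$\begin{aligned}
\Bigl(\frac{1}{a_1} + \frac{\alpha}{r}\Bigr)\psi_a - \Bigl(\frac{\partial}{\partial r} + \frac{j+1}{r}\Bigr)\psi_b &= 0, \\
\Bigl(\frac{\partial}{\partial r} - \frac{j-1}{r}\Bigr)\psi_a - \Bigl(\frac{1}{a_2} - \frac{\alpha}{r}\Bigr)\psi_b &= 0,
\end{aligned} \tag{45}$$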
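where $\alpha = e^2/\hbar c$, which is a small number. We shall solve these
equations by a similar method to that used for equation (73) in § 39.
Put

$$\psi_a = r^{-1}e^{-r/a}f, \qquad \psi_b = r^{-1}e^{-r/a}g, \tag{46}$$

introducing two new functions, $f$ and $g$, of $r$, where

$$a = (a_1 a_2)^{1/2} = \hbar(m^2c^2 - H'^2/c^2)^{-1/2}. \tag{47}$$

Equations (45) become

$$\begin{aligned}
\Bigl(\frac{1}{a_1} + \frac{\alpha}{r}\Bigr)f - \Bigl(\frac{\partial}{\partial r} - \frac{1}{a} + \frac{j}{r}\Bigr)g &= 0, \\
\Bigl(\frac{\partial}{\partial r} - \frac{1}{a} - \frac{j}{r}\Bigr)f - \Bigl(\frac{1}{a_2} - \frac{\alpha}{r}\Bigr)g &= 0.
\end{aligned} \tag{48}$$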


We now try for a solution in which $f$ and $g$ are in the form of power
series,

$$f = \sum_s c_s r^s, \qquad g = \sum_s c'_s r^s, \tag{49}$$

in which consecutive values of $s$ differ by unity though these values
need not be integers. Substituting these expressions for $f$ and $g$ in
(48) and picking out coefficients of $r^{s-1}$, we obtain

$$\begin{aligned}
c_{s-1}/a_1 + \alpha c_s - (s+j)c'_s + c'_{s-1}/a &= 0, \\
-c'_{s-1}/a_2 + \alpha c'_s + (s-j)c_s - c_{s-1}/a &= 0.
\end{aligned} \tag{50}$$

By multiplying the first of these equations by $a$ and the second
by $a_2$ and adding, we eliminate both $c_{s-1}$ and $c'_{s-1}$, since from
(47) $a/a_1 = a_2/a$. We are left with

$$[a\alpha + a_2(s-j)]c_s + [a_2\alpha - a(s+j)]c'_s = 0, \tag{51}$$

a relation which shows the connexion between the primed and
unprimed $c$'s.

The boundary condition at $r = 0$ requires that $r\psi_a$ and $r\psi_b \to 0$ as
$r \to 0$, so from (46) $f$ and $g \to 0$ as $r \to 0$. Thus the series (49) must
terminate on the side of small $s$. If $s_0$ is the minimum value of $s$ for
which $c_s$ and $c'_s$ do not both vanish, we obtain from (50), by putting
$s = s_0$ and $c_{s_0-1} = c'_{s_0-1} = 0$,

$$\begin{aligned}
\alpha c_{s_0} - (s_0+j)c'_{s_0} &= 0, \\
\alpha c'_{s_0} + (s_0-j)c_{s_0} &= 0,
\end{aligned} \tag{52}$$

which give

$$\alpha^2 = -s_0^2 + j^2.$$

Since the boundary condition requires that the minimum value of $s$
shall be greater than zero, we must take

$$s_0 = +\sqrt{(j^2 - \alpha^2)}.$$
To investigate the convergence of the series (49) we shall determine
the ratio $c_s/c_{s-1}$ for large $s$. Equation (51) and the second of equations
(50) give approximately, when $s$ is large,

$$a_2 c_s = a c'_s$$

and

$$s c_s = c_{s-1}/a + c'_{s-1}/a_2.$$

Hence

$$c_s/c_{s-1} = 2/as.$$

The series (49) will therefore converge like

$$\sum_s \frac{1}{s!}\Bigl(\frac{2r}{a}\Bigr)^s,$$

or $e^{2r/a}$. This result is similar to that obtained in § 39 and allows us
to infer, as in § 39, that all values of $H'$ are permissible for which $a$
is pure imaginary, i.e. from (47), for which $H' > mc^2$, while for
$H' < mc^2$ we take $a$ to be positive and then find that only those
values of $H'$ are permissible for which the series (49) terminate on
the side of large $s$.

If the series (49) terminate with the terms $c_s$ and $c'_s$, so that
$c_{s+1} = c'_{s+1} = 0$, we obtain from (50) with $s+1$ substituted for $s$

$$\begin{aligned}
c_s/a_1 + c'_s/a &= 0, \\
c'_s/a_2 + c_s/a &= 0.
\end{aligned} \tag{53}$$

These two equations are equivalent on account of (47). When
combined with (51), they give

$$a_1[a\alpha + a_2(s-j)] = a[a_2\alpha - a(s+j)],$$

which reduces to

$$2a_1 a_2 s = a(a_2 - a_1)\alpha$$

or

$$\frac{s}{a} = \frac{1}{2}\Bigl(\frac{1}{a_1} - \frac{1}{a_2}\Bigr)\alpha = \frac{H'}{c\hbar}\,\alpha$$

with the help of (44). Squaring and using (47), we obtain

$$s^2(m^2c^2 - H'^2/c^2) = \alpha^2 H'^2/c^2.$$

Hence

$$\frac{H'}{mc^2} = \Bigl(1 + \frac{\alpha^2}{s^2}\Bigr)^{-1/2}.$$

The $s$ here, which specifies the last term in the series, must be greater
than $s_0$ by some integer not less than zero. Calling this integer $n$,
we have

$$s = n + \sqrt{(j^2 - \alpha^2)}$$

and thus

$$\frac{H'}{mc^2} = \Bigl\{1 + \frac{\alpha^2}{\{n + \sqrt{(j^2 - \alpha^2)}\}^2}\Bigr\}^{-1/2}. \tag{54}$$
This formula gives the discrete energy-levels of the hydrogen
spectrum and was first obtained by Sommerfeld working with Bohr's
orbit theory. There are two quantum numbers $n$ and $j$ involved, but
owing to $\alpha^2$ being very small the energy depends almost entirely on
$n + |j|$. Values of $n$ and $|j|$ that give the same $n + |j|$ give rise to a
set of energy-levels lying very close to one another, and to the
energy-level given by the non-relativistic formula (80) of § 39 with
$s = n + |j|$, apart from the constant term $mc^2$.
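
The closeness of the levels having the same $n + |j|$ is easily exhibited
numerically from formula (54). The following is a sketch only: the
numerical value taken for $\alpha$ and the function names are assumptions
introduced here, the comparison formula $1 - \alpha^2/2(n+|j|)^2$ is the
non-relativistic level with the constant term $mc^2$ restored, written in
units of $mc^2$, and the restriction on the sign of $j$ for $n = 0$
discussed in this section is not enforced.

```python
import math

# Approximate numerical value of alpha = e^2/(hbar c); an assumption
# of this sketch, not a number quoted in the text.
ALPHA = 1.0 / 137.035999


def energy_ratio(n, j, alpha=ALPHA):
    """H'/mc^2 from formula (54) for integer n >= 0 and integer j != 0."""
    if n < 0 or j == 0:
        raise ValueError("need n >= 0 and j != 0")
    s = n + math.sqrt(j * j - alpha * alpha)
    return 1.0 / math.sqrt(1.0 + alpha * alpha / (s * s))


def nonrelativistic_ratio(n, j, alpha=ALPHA):
    """Non-relativistic level plus rest-energy, in units of mc^2."""
    big_n = n + abs(j)
    return 1.0 - alpha * alpha / (2.0 * big_n * big_n)


if __name__ == "__main__":
    for n, j in [(0, 1), (1, 1), (0, 2), (2, 1), (1, 2), (0, 3)]:
        rel = energy_ratio(n, j)
        nonrel = nonrelativistic_ratio(n, j)
        print(f"n={n} |j|={abs(j)}  H'/mc^2={rel:.12f}  "
              f"difference from non-relativistic={rel - nonrel:.3e}")
```

The two levels with $n + |j| = 2$ differ by an amount of order $\alpha^4$,
which is the fine-structure.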

We used equations (53) by combining them with (51), but this does
not make full use of (53) since the coefficients of $c_s$ and $c'_s$ in (51) may
both vanish. In this case we get, multiplying the first coefficient by
$a$ and the second by $a_2$ and adding,

$$(a^2 + a_2^2)\alpha - 2aa_2 j = 0.$$

With the help of (47) and (44) this gives

$$(a_1 + a_2)\alpha = 2aj$$

or

$$\frac{j}{\alpha} = \frac{a_1 + a_2}{2a} = \frac{1}{2}\Bigl(\frac{a}{a_1} + \frac{a}{a_2}\Bigr) = \frac{mca}{\hbar} = \frac{mc}{(m^2c^2 - H'^2/c^2)^{1/2}}$$

or

$$\frac{H'^2}{m^2c^4} = 1 - \frac{\alpha^2}{j^2}.$$

Since $H'$ must be positive, this leads to

$$\frac{H'}{mc^2} = \frac{\sqrt{(j^2 - \alpha^2)}}{|j|}, \tag{55}$$

which is the value of $H'$ given by (54) when $n = 0$. The case $n = 0$
thus needs further investigation to see whether the conditions (53)
are then fulfilled.

With $n = 0$, the maximum value of $s$ is the same as the minimum,
so equations (53) with $s_0$ substituted for $s$ should agree with (52).
Now (55) gives, from (44) and (47),

$$\frac{1}{a_1} = \frac{mc}{\hbar}\Bigl\{1 + \frac{\sqrt{(j^2 - \alpha^2)}}{|j|}\Bigr\}, \qquad \frac{1}{a} = \frac{mc}{\hbar}\,\frac{\alpha}{|j|},$$

so the first of equations (53) with $s_0$ substituted for $s$ gives

$$c_{s_0}\{|j| + \sqrt{(j^2 - \alpha^2)}\} + c'_{s_0}\alpha = 0.$$

This agrees with the second of equations (52) provided $j$ is negative.
We can conclude that, for $n = 0$, $j$ must be a negative integer, while
for the other values of $n$ all non-zero integral values of $j$ are allowed.
73. Theory of the positron

It has been mentioned in § 67 that the wave equation for the
electron admits of twice as many solutions as it ought to, half of them
referring to states with negative values for the kinetic energy $cp_0 + eA_0$.
This difficulty was introduced as soon as we passed from equation (3)
to equation (4) and is inherent in any relativistic theory. It occurs
also in classical relativistic theory, but is not then serious since, owing
to the continuity in the variation of all classical dynamical variables,
if the kinetic energy $cp_0 + eA_0$ is initially positive (when it must be
greater than or equal to $mc^2$), it cannot subsequently be negative
(when it would have to be less than or equal to $-mc^2$). In the
quantum theory, however, discontinuous transitions may take place,
so that if the electron is initially in a state of positive kinetic energy
it may make a transition to a state of negative kinetic energy. It is
therefore no longer permissible simply to ignore the negative-energy
states, as one can do in the classical theory.

Let us examine the negative-energy solutions of the equation

$$\Bigl\{\Bigl(p_0 + \frac{e}{c}A_0\Bigr) + \alpha_1\Bigl(p_1 + \frac{e}{c}A_1\Bigr) + \alpha_2\Bigl(p_2 + \frac{e}{c}A_2\Bigr) + \alpha_3\Bigl(p_3 + \frac{e}{c}A_3\Bigr) + \alpha_m mc\Bigr\}|x\rangle = 0 \tag{56}$$

a little more closely. For this purpose it is convenient to use a
representation of the $\alpha$'s in which all the elements of the matrices
representing $\alpha_1$, $\alpha_2$, and $\alpha_3$ are real and all those of the matrix representing
$\alpha_m$ are pure imaginary. Such a representation may be obtained, for
instance, from that of § 67 by interchanging the expressions for $\alpha_2$
and $\alpha_m$ in (7). If equation (56) is expressed as a matrix equation in
this representation and we put $-i$ for $i$ in all the matrix elements,
we get, remembering (1) and (2), the matrix form of the equation

$$\Bigl\{\Bigl(-p_0 + \frac{e}{c}A_0\Bigr) + \alpha_1\Bigl(-p_1 + \frac{e}{c}A_1\Bigr) + \alpha_2\Bigl(-p_2 + \frac{e}{c}A_2\Bigr) + \alpha_3\Bigl(-p_3 + \frac{e}{c}A_3\Bigr) - \alpha_m mc\Bigr\}|x^*\rangle = 0, \tag{57}$$

where $|x^*\rangle$ is the ket whose representative is the conjugate complex
of the representative of $|x\rangle$. Thus each solution $|x\rangle$ of (56)
determines uniquely a solution $|x^*\rangle$ of (57) with the conjugate complex
representative. Further, if the solution $|x\rangle$ of (56) belongs to a
negative value for $cp_0 + eA_0$, the corresponding solution $|x^*\rangle$ of (57)
will belong to a positive value for $cp_0 - eA_0$. But equation (57) is just
what one would get if one substituted $-e$ for $e$ in (56). It follows
that each negative-energy solution of (56) corresponds to a
positive-energy solution of the wave equation obtained from (56) by
substitution of $-e$ for $e$, which solution represents an electron of charge $+e$
(instead of $-e$, as we had up to the present) moving through the
given electromagnetic field. Thus the unwanted solutions of (56) are
connected with the motion of an electron with a charge $+e$. (It is
not possible, of course, with an arbitrary electromagnetic field, to
separate the solutions of (56) definitely into those referring to positive
and those referring to negative values for $cp_0 + eA_0$, as such a
separation would imply that transitions from one kind to the other
do not occur. The preceding discussion is therefore only a rough
one, applying to the case when such a separation is approximately
possible.)

In this way we are led to infer that the negative-energy solutions
of (56) refer to the motion of a new kind of particle having the mass
of an electron and the opposite charge. Such particles have been
observed experimentally and are called positrons. We cannot,
however, simply assert that the negative-energy solutions represent
positrons, as this would make the dynamical relations all wrong. For
instance, it is certainly not true that a positron has a negative kinetic
energy. We must therefore establish the theory of the positrons on
a somewhat different footing. We assume that nearly all the
negative-energy states are occupied, with one electron in each state in accordance
with the exclusion principle of Pauli. An unoccupied negative-energy
state will now appear as something with a positive energy, since to
make it disappear, i.e. to fill it up, we should have to add to it an
electron with negative energy. We assume that these unoccupied
negative-energy states are the positrons.

These assumptions require there to be a distribution of electrons
of infinite density everywhere in the world. A perfect vacuum is a
region where all the states of positive energy are unoccupied and all
those of negative energy are occupied. In a perfect vacuum Maxwell's
equation

$$\operatorname{div}\boldsymbol{\mathcal{E}} = 0$$

must, of course, be valid. This means that the infinite distribution
of negative-energy electrons does not contribute to the electric field.
Only departures from the distribution in a vacuum will contribute
to the electric density $\rho$ in Maxwell's equation

$$\operatorname{div}\boldsymbol{\mathcal{E}} = 4\pi\rho.$$

Thus there will be a contribution $-e$ for each occupied state of
positive energy and a contribution $+e$ for each unoccupied state of
negative energy.

The exclusion principle will operate to prevent a positive-energy 
electron ordinarily from making transitions to states of negative 
energy. It will still be possible, however, for such an electron to 
drop into an unoccupied state of negative energy. In this case we 
should have an electron and positron disappearing simultaneously, 
their energy being emitted in the form of radiation. The converse 
process would consist in the creation of an electron and a positron 
from electromagnetic radiation. 

The theory of the positron here given appears at first sight to treat
the electrons and positrons on very different footings, but actually
the fundamental ideas of the theory are symmetrical between the
electrons and positrons. We should have an equivalent theory if we
supposed the positrons to be the basic particles, described by wave
equations of the form (9) with $-e$ for $e$, and then supposed that nearly
all the states of negative energy for the positron are filled up, a hole
in the distribution of negative-energy positrons being then
interpreted as an ordinary electron. The theory could be developed
consistently with the hypothesis that all the laws of physics are
symmetrical between positive and negative electric charge.
74. Relativistic notation

In § 63 a theory was given of the interaction of an atom with a field 
of radiation. This theory was an approximate one, valid for radiation 
of long wave-length and for a certain simplified model of the atom. 
Our present problem is to improve this theory, and in particular to 
make it relativistic, so that it may be applied to particles moving at 
high speed. We must first set up a notation suitable for handling the 
relativistic equations with which we shall have to deal. 

We choose units of space and time which make the velocity of light 
unity, so that c will no longer appear in our equations. A point in 
space-time is located by its three Cartesian coordinates $x_1$, $x_2$, $x_3$ and
its time $t = x_0$, which together form a 4-vector $x_\mu$ ($\mu = 0, 1, 2, 3$), or
$x$ as we may write it in vector notation. Two 4-vectors $a$ and $b$ have
a Lorentz-invariant scalar product $(ab)$ given by

$$(ab) = a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3 = a_0 b_0 - (\mathbf{a}\mathbf{b}), \tag{1}$$

$(\mathbf{a}\mathbf{b})$ being the three-dimensional scalar product of the
three-dimensional parts of $a$ and $b$. To take into account the $-$ signs in (1), it
is convenient to introduce vector components with raised suffixes,
defined by

$$a^0 = a_0, \quad a^1 = -a_1, \quad a^2 = -a_2, \quad a^3 = -a_3, \tag{2}$$

so that the scalar product $(ab)$ may be written

$$(ab) = a^\mu b_\mu = a_\mu b^\mu, \tag{3}$$

a summation being implied over a repeated (letter) suffix in a term.
The components $a^\mu$ are called the covariant components of the 4-vector
$a$, the original components $a_\mu$, which transform like the four
coordinates $x_\mu$ of a point in space-time, being called the contravariant
components.

The fundamental tensor $g^{\mu\nu}$ is defined by

$$g^{00} = 1, \quad g^{11} = g^{22} = g^{33} = -1,$$

$$g^{\mu\nu} = 0 \quad \text{for } \mu \neq \nu.$$

With its help we can write the rule (2) connecting the covariant
and contravariant components of a vector

$$g^{\mu\nu}a_\nu = a^\mu$$

and we can write the scalar product $(ab)$ as

$$(ab) = g^{\mu\nu}a_\mu b_\nu.$$
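
The rules (1) to (3) and the invariance of the scalar product can be
illustrated with a few lines of numerical work. This is a sketch with
hypothetical function names; it assumes units with the velocity of
light unity, as in the text, and takes for the Lorentz transformation
a uniform velocity `v` along the axis of $x_1$.

```python
import numpy as np

# Fundamental tensor: diagonal elements 1, -1, -1, -1.
G = np.diag([1.0, -1.0, -1.0, -1.0])


def raise_suffix(a):
    """Components with raised suffix from the original ones, rule (2)."""
    return G @ np.asarray(a, dtype=float)


def scalar_product(a, b):
    """(ab) = a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3, formula (1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ G @ b)


def boost_x1(v):
    """Matrix of a Lorentz transformation with velocity v along x_1."""
    gamma = 1.0 / np.sqrt(1.0 - v * v)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * v
    return L


if __name__ == "__main__":
    a = np.array([2.0, 0.3, -1.0, 0.5])
    b = np.array([1.5, -0.7, 0.2, 0.9])
    L = boost_x1(0.6)
    print(scalar_product(a, b), scalar_product(L @ a, L @ b))
```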

The operators $\partial/\partial x_\mu$ form the covariant components of a 4-vector,
and the contravariant components of the vector are written $\partial/\partial x^\mu$.
Equations (1) and (2) of § 66 may be written

$$p^\mu = i\hbar\,\frac{\partial}{\partial x_\mu}, \tag{5}$$

and show how the momentum-energy 4-vector of a particle is related
to the operator of differentiation applied to the wave function.

The function $\delta(xx)$ is evidently Lorentz invariant. It vanishes
everywhere except on the light-cone with the origin as vertex, i.e. the
three-dimensional space $(xx) = 0$. This light-cone consists of two
distinct parts, a future part, for which $x_0 > 0$, and a past part, for which
$x_0 < 0$. The function which equals $\delta(xx)$ on the future part of the
light-cone and $-\delta(xx)$ on the past part of the light-cone is also
Lorentz invariant. This function, which equals $\delta(xx)x_0/|x_0|$, plays
an important role in the dynamical theory of fields, so we introduce a
special notation for it. We define

$$\Delta(x) = 2\delta(xx)x_0/|x_0|. \tag{6}$$

This definition gives a meaning to the function $\Delta$ applied to any
4-vector. With the help of (1) and of (9) of § 15, we can express
$\delta(xx)$ in the form

$$\delta(xx) = \tfrac{1}{2}|\mathbf{x}|^{-1}\{\delta(x_0 - |\mathbf{x}|) + \delta(x_0 + |\mathbf{x}|)\}, \tag{7}$$

$|\mathbf{x}|$ being the length of the three-dimensional part of $x$, and then
$\Delta(x)$ takes the form

$$\Delta(x) = |\mathbf{x}|^{-1}\{\delta(x_0 - |\mathbf{x}|) - \delta(x_0 + |\mathbf{x}|)\}. \tag{8}$$

$\Delta(x)$ is defined to have the value zero at the origin, and evidently
$\Delta(-x) = -\Delta(x)$.

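Let us make a Fourier analysis of $\Delta(x)$. Using $d^4x$ to denote
$dx_0\,dx_1\,dx_2\,dx_3$ and $d^3x$ to denote $dx_1\,dx_2\,dx_3$ we have, for any 4-vector $k$,

$$\begin{aligned}
\int \Delta(x)e^{i(kx)}\,d^4x &= \int |\mathbf{x}|^{-1}\{\delta(x_0 - |\mathbf{x}|) - \delta(x_0 + |\mathbf{x}|)\}e^{i[k_0 x_0 - (\mathbf{k}\mathbf{x})]}\,d^4x \\
&= \int |\mathbf{x}|^{-1}\{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}e^{-i(\mathbf{k}\mathbf{x})}\,d^3x.
\end{aligned}$$

By introducing polar coordinates $|\mathbf{x}|$, $\theta$, $\phi$ in the three-dimensional
$x_1 x_2 x_3$ space, with the direction of the three-dimensional part of $k$ as
pole, we get

$$\begin{aligned}
\int \Delta(x)e^{i(kx)}\,d^4x &= \iiint \{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}e^{-i|\mathbf{k}||\mathbf{x}|\cos\theta}\,|\mathbf{x}|\sin\theta\,d\theta\,d\phi\,d|\mathbf{x}| \\
&= 2\pi\int_0^\infty \{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}\,d|\mathbf{x}| \int_0^\pi e^{-i|\mathbf{k}||\mathbf{x}|\cos\theta}\,|\mathbf{x}|\sin\theta\,d\theta \\
&= 2\pi i|\mathbf{k}|^{-1}\int_0^\infty \{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}\,d|\mathbf{x}|\,\{e^{-i|\mathbf{k}||\mathbf{x}|} - e^{i|\mathbf{k}||\mathbf{x}|}\} \\
&= 2\pi i|\mathbf{k}|^{-1}\int_{-\infty}^{\infty} \{e^{i(k_0 - |\mathbf{k}|)a} - e^{i(k_0 + |\mathbf{k}|)a}\}\,da \\
&= 4\pi^2 i|\mathbf{k}|^{-1}\{\delta(k_0 - |\mathbf{k}|) - \delta(k_0 + |\mathbf{k}|)\} \\
&= 4\pi^2 i\,\Delta(k).
\end{aligned} \tag{9}$$

Thus the Fourier analysis gives the same function again, with the
coefficient $4\pi^2 i$. Interchanging $k$ and $x$ in (9), we get

$$\Delta(x) = -i/4\pi^2\,.\int \Delta(k)e^{i(kx)}\,d^4k. \tag{10}$$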

Some of the important properties of $\Delta(x)$ can easily be deduced
from its Fourier resolution. In the first place equation (10) shows that
$\Delta(x)$ can be resolved into waves all travelling with the velocity of
light. To get an equation for this result we apply the operator $\Box$ to
both sides of (10), thus

$$\Box\Delta(x) = -i/4\pi^2\,.\int \Delta(k)\,\Box e^{i(kx)}\,d^4k = i/4\pi^2\,.\int (kk)\Delta(k)e^{i(kx)}\,d^4k.$$

Now $(kk)\Delta(k) = 0$, and hence

$$\Box\Delta(x) = 0. \tag{11}$$

This equation holds throughout space-time. We can give a meaning
to $\Box\Delta(x)$ at a point where $\Delta(x)$ is singular by taking the integral
of $\Box\Delta(x)$ over a small four-dimensional space surrounding the point
and transforming it to a three-dimensional surface integral by Gauss's
theorem. Equation (11) informs us that the three-dimensional surface
integral always vanishes.

The function $\Delta(x)$ vanishes all over the three-dimensional surface
$x_0 = 0$. Let us determine the value of $\partial\Delta(x)/\partial x_0$ on this surface. It
evidently vanishes everywhere except at the point $x_1 = x_2 = x_3 = 0$,
where it has a singularity which can be evaluated as follows.
Differentiating both sides of (10) with respect to $x_0$, we get

$$\begin{aligned}
\partial\Delta(x)/\partial x_0 &= 1/4\pi^2\,.\int k_0\Delta(k)e^{i(kx)}\,d^4k \\
&= 1/4\pi^2\,.\int k_0|\mathbf{k}|^{-1}\{\delta(k_0 - |\mathbf{k}|) - \delta(k_0 + |\mathbf{k}|)\}e^{i(kx)}\,d^4k \\
&= 1/4\pi^2\,.\int \{\delta(k_0 - |\mathbf{k}|) + \delta(k_0 + |\mathbf{k}|)\}e^{i(kx)}\,d^4k.
\end{aligned}$$

Putting $x_0 = 0$ on both sides here, we get

$$\begin{aligned}
[\partial\Delta(x)/\partial x_0]_{x_0=0} &= 1/4\pi^2\,.\int \{\delta(k_0 - |\mathbf{k}|) + \delta(k_0 + |\mathbf{k}|)\}e^{-i(\mathbf{k}\mathbf{x})}\,d^4k \\
&= 1/2\pi^2\,.\int e^{-i(\mathbf{k}\mathbf{x})}\,d^3k \\
&= 4\pi\,\delta(x_1)\delta(x_2)\delta(x_3).
\end{aligned} \tag{12}$$

Thus the ordinary $\delta$ singularity, with the coefficient $4\pi$, appears at
the point $x_1 = x_2 = x_3 = 0$.

75. The quantum conditions for the field 

In § 63 a theory of a field of radiation without interaction with 
matter was first developed and the interaction was taken into account 
subsequently. In the theory without interaction dynamical variables 
were introduced to describe the field, commutation relations were 
established for these dynamical variables, and a Hamiltonian was set 
up which made the dynamical variables vary correctly with the time. 
No approximations were made in this work. The theory would
therefore be a quite satisfactory, exact theory of radiation without
interaction with matter, were it not for one feature in it, namely our
taking the scalar potential to be zero at the outset. This feature 
spoils the relativistic form of the theory and makes it unsuitable as 
a starting-point from which to develop an accurate theory of radiation 
in interaction with matter. We shall here consider how to put the 
theory of radiation without interaction with matter into relativistic 
form. 

We leave the scalar potential $A_0$ arbitrary and it then forms,
together with the vector potential $A_1$, $A_2$, $A_3$, a 4-vector $A_\mu$. The
Maxwell equations (62) of § 63 must then be generalized to

$$\Box A_\mu = 0, \qquad \partial A_\mu/\partial x_\mu = 0. \tag{13}$$

For the present we shall ignore the second of these equations and
work only from the first. This equation shows that each $A_\mu$ can be
resolved into waves travelling with the velocity of light, so that its
Fourier resolution is of the form

$$A_\mu(x) = 2\int \delta(kk)A_{\mathbf{k}\mu}e^{i(kx)}\,d^4k, \tag{14}$$

$x$ denoting a general point in space-time. The factor $\delta(kk)$ here
ensures that the integrand vanishes except for those values of the
4-vector $k$ which satisfy $(kk) = 0$, and the coefficient $A_{\mathbf{k}\mu}$ may be
considered as undefined except when $(kk) = 0$. Since $A_\mu(x)$ is real,
we must have $A_{-\mathbf{k}\mu} = \bar{A}_{\mathbf{k}\mu}$, so (14) may also be written

$$A_\mu(x) = 2\int_{k_0>0} \delta(kk)\{A_{\mathbf{k}\mu}e^{i(kx)} + \bar{A}_{\mathbf{k}\mu}e^{-i(kx)}\}\,d^4k. \tag{15}$$

With the help of formula (7) applied to $k$, this goes over into

$$\begin{aligned}
A_\mu(x) &= \int_{k_0>0} |\mathbf{k}|^{-1}\delta(k_0 - |\mathbf{k}|)\{A_{\mathbf{k}\mu}e^{i(kx)} + \bar{A}_{\mathbf{k}\mu}e^{-i(kx)}\}\,d^4k \\
&= \int (A_{\mathbf{k}\mu}e^{i(kx)} + \bar{A}_{\mathbf{k}\mu}e^{-i(kx)})\,k_0^{-1}\,d^3k,
\end{aligned} \tag{16}$$

where it is implied in the last integrand that $k_0 = |\mathbf{k}|$, i.e. that $k$ is
a 4-vector lying on the future part of the light-cone.

Equation (16) is usually the most convenient form in which to give
the Fourier resolution of $A_\mu$. For $\mu = 1, 2, 3$ it agrees with (63) of
§ 63, except for the factor $k_0^{-1}$ in (16). This factor is a desirable one
to have in a relativistic theory, since the product $k_0^{-1}\,d^3k$ gives a
Lorentz invariant element on the light-cone $(kk) = 0$. The Lorentz
invariance can be proved by direct geometrical methods, and can
also be inferred from the above analysis, it being evident that the
coefficient $A_{\mathbf{k}\mu}$ introduced by (14) is a 4-vector for each value of $k$
on the light-cone, so that the factor $\{\ \}$ in (16) is also a 4-vector, and
hence the remaining factors on the right-hand side of (16) must form
a four-dimensional scalar.
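
The invariance of the element $k_0^{-1}\,d^3k$ on the light-cone may also be
checked numerically, by comparing the functional determinant of a
Lorentz transformation of the three-dimensional part of $k$ with the
ratio of the new and old values of $k_0$. The sketch below is only an
illustration: it assumes a transformation with velocity `v` along the
axis of $k_3$, forms the determinant by finite differences, and uses
hypothetical function names.

```python
import numpy as np


def boost_on_cone(k_space, v):
    """Transform the 4-vector (|k|, k) on the future light-cone with
    velocity v along axis 3 and return its new three-dimensional part."""
    k = np.asarray(k_space, dtype=float)
    k0 = np.linalg.norm(k)
    gamma = 1.0 / np.sqrt(1.0 - v * v)
    return np.array([k[0], k[1], gamma * (k[2] - v * k0)])


def jacobian(k_space, v, h=1e-6):
    """Finite-difference matrix of derivatives d k'_r / d k_c."""
    k = np.asarray(k_space, dtype=float)
    J = np.empty((3, 3))
    for col in range(3):
        dk = np.zeros(3)
        dk[col] = h
        J[:, col] = (boost_on_cone(k + dk, v) - boost_on_cone(k - dk, v)) / (2 * h)
    return J


def invariant_measure_ratio(k_space, v):
    """(d^3k'/k'_0) divided by (d^3k/k_0); unity if the element is invariant."""
    k = np.asarray(k_space, dtype=float)
    k_new = boost_on_cone(k, v)
    det = abs(np.linalg.det(jacobian(k, v)))
    return det * np.linalg.norm(k) / np.linalg.norm(k_new)


if __name__ == "__main__":
    print(invariant_measure_ratio([0.3, -0.4, 1.2], 0.5))
```

The printed ratio equals unity to within the error of the finite
differences, the determinant being just $k_0'/k_0$.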

The quantities $A_\mu$ and $\partial A_\mu/\partial x_0$ for all $x_1$, $x_2$, $x_3$ at a given time
$x_0 = t$ are sufficient, with the help of the first of equations (13), to
determine the potentials throughout space-time, so these quantities
may be considered as the dynamical variables describing the field of
radiation considered as a dynamical system. (They are the ordinary
dynamical variables of the classical theory, or the Heisenberg
dynamical variables of the quantum theory.) Define the quantities $A_{\mathbf{k}\mu t}$ for
$k_0 > 0$ by

$$A_{\mathbf{k}\mu t} = A_{\mathbf{k}\mu}e^{ik_0 t}. \tag{17}$$

Then

$$\begin{aligned}
A_\mu(x) &= \int \{A_{\mathbf{k}\mu t}e^{-i(\mathbf{k}\mathbf{x})} + \bar{A}_{\mathbf{k}\mu t}e^{i(\mathbf{k}\mathbf{x})}\}\,k_0^{-1}\,d^3k, \\
\partial A_\mu(x)/\partial x_0 &= i\int \{A_{\mathbf{k}\mu t}e^{-i(\mathbf{k}\mathbf{x})} - \bar{A}_{\mathbf{k}\mu t}e^{i(\mathbf{k}\mathbf{x})}\}\,d^3k,
\end{aligned} \tag{18}$$

$k_0$ being understood as equal to $|\mathbf{k}|$ in the integrands here. These
equations express $A_\mu$ and $\partial A_\mu/\partial x_0$ at time $t$ as functions of $A_{\mathbf{k}\mu t}$ and
$\bar{A}_{\mathbf{k}\mu t}$ not involving $t$ explicitly. By reversing the three-dimensional
Fourier analysis of equations (18) we can get $A_{\mathbf{k}\mu t}$ and $\bar{A}_{\mathbf{k}\mu t}$ as
functions of $A_\mu$ and $\partial A_\mu/\partial x_0$ at time $t$ not involving $t$ explicitly.
Hence we may take $A_{\mathbf{k}\mu t}$ and $\bar{A}_{\mathbf{k}\mu t}$ for all $\mu$ and all $\mathbf{k}$ with $k_0 > 0$ as
the dynamical variables describing the system, instead of $A_\mu$ and
$\partial A_\mu/\partial x_0$ at time $t$.

We must now determine the quantum conditions for the $A_{\mathbf{k}\mu t}$ and
$\bar{A}_{\mathbf{k}\mu t}$. In the first place, variables referring to different values of $\mathbf{k}$ or
of $\mu$ belong to different degrees of freedom and therefore commute.
We can get information about the quantum conditions for variables
referring to the same value of $\mathbf{k}$ and $\mu$ from the work of § 63. To
connect up with this work, we pass over to discrete $\mathbf{k}$-values in
three-dimensional $\mathbf{k}$-space. Equation (73) of § 63 gives, on taking into
account that the present $A_{\mathbf{k}}$ variables are $k_0$ times those of § 63,

2irA k]i = (19) 

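Let us consider one particular discrete $\mathbf{k}$-value for which $k_1 = k_2 = 0$,
$k_3 = k_0 > 0$. Then the polarization variable $l$ can take on two values
referring to the two directions 1 and 2, so equation (19) gives, with
the help of the commutation relations for the $\eta$'s and $\bar\eta$'s, equations

$$\begin{aligned}
\bar{A}_{\mathbf{k}1t}A_{\mathbf{k}1t} - A_{\mathbf{k}1t}\bar{A}_{\mathbf{k}1t} &= \hbar k_0 s_{\mathbf{k}}/4\pi^2, \\
\bar{A}_{\mathbf{k}2t}A_{\mathbf{k}2t} - A_{\mathbf{k}2t}\bar{A}_{\mathbf{k}2t} &= \hbar k_0 s_{\mathbf{k}}/4\pi^2.
\end{aligned} \tag{20}$$

With the help of (17), these equations may be written in terms of the
$A_{\mathbf{k}\mu}$, $\bar{A}_{\mathbf{k}\mu}$ for $k_0 > 0$

$$\begin{aligned}
\bar{A}_{\mathbf{k}1}A_{\mathbf{k}1} - A_{\mathbf{k}1}\bar{A}_{\mathbf{k}1} &= \hbar k_0 s_{\mathbf{k}}/4\pi^2, \\
\bar{A}_{\mathbf{k}2}A_{\mathbf{k}2} - A_{\mathbf{k}2}\bar{A}_{\mathbf{k}2} &= \hbar k_0 s_{\mathbf{k}}/4\pi^2.
\end{aligned} \tag{21}$$

The work of § 63 gives us no information about $A_{\mathbf{k}3}$ and $A_{\mathbf{k}0}$.

However, we can now obtain the quantum conditions for $A_{\mathbf{k}3}$ and
$A_{\mathbf{k}0}$ from the theory of relativity. Equations (21) have to be built
up into a relativistic set of equations and the only simple way of doing
so is by adding to them the two further equations

$$\begin{aligned}
\bar{A}_{\mathbf{k}3}A_{\mathbf{k}3} - A_{\mathbf{k}3}\bar{A}_{\mathbf{k}3} &= \hbar k_0 s_{\mathbf{k}}/4\pi^2, \\
\bar{A}_{\mathbf{k}0}A_{\mathbf{k}0} - A_{\mathbf{k}0}\bar{A}_{\mathbf{k}0} &= -\hbar k_0 s_{\mathbf{k}}/4\pi^2.
\end{aligned} \tag{22}$$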
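Note the opposite sign in the last of these equations. The four
equations (21) and (22), together with the conditions that $A_{\mathbf{k}\mu}$ and $\bar{A}_{\mathbf{k}\nu}$
commute for $\mu \neq \nu$, can then be written as a single tensor equation

$$\bar{A}_{\mathbf{k}\mu}A_{\mathbf{k}\nu} - A_{\mathbf{k}\nu}\bar{A}_{\mathbf{k}\mu} = -g_{\mu\nu}\,\hbar k_0 s_{\mathbf{k}}/4\pi^2. \tag{23}$$

We get in this way the quantum conditions for all the dynamical
variables. Equation (23) can be extended to

$$\bar{A}_{\mathbf{k}\mu}A_{\mathbf{k}'\nu} - A_{\mathbf{k}'\nu}\bar{A}_{\mathbf{k}\mu} = -g_{\mu\nu}/4\pi^2\,.\,\hbar k_0 s_{\mathbf{k}}\,\delta_{\mathbf{k}\mathbf{k}'}. \tag{24}$$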

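Let us now return to continuous $\mathbf{k}$-values. To convert $\delta_{\mathbf{k}\mathbf{k}'}$ to
continuous $\mathbf{k}$-values we note that, for a general function $f(\mathbf{k})$ in
three-dimensional $\mathbf{k}$-space,

$$\sum_{\mathbf{k}} f(\mathbf{k})\,\delta_{\mathbf{k}\mathbf{k}'} = f(\mathbf{k}') = \int f(\mathbf{k})\,\delta_3(\mathbf{k}-\mathbf{k}')\,d^3k, \tag{25}$$

where $\delta_3(\mathbf{k}-\mathbf{k}')$ is the three-dimensional $\delta$ function

$$\delta_3(\mathbf{k}-\mathbf{k}') = \delta(k_1-k_1')\,\delta(k_2-k_2')\,\delta(k_3-k_3').$$

In order that (25) may conform to the standard formula connecting
sums and integrals, equation (52) of § 62, we must have

$$s_{\mathbf{k}}\,\delta_{\mathbf{k}\mathbf{k}'} = \delta_3(\mathbf{k}-\mathbf{k}'). \tag{26}$$

Thus (24) goes over to

$$\bar A_{\mathbf{k}\mu}A_{\mathbf{k}'\nu} - A_{\mathbf{k}'\nu}\bar A_{\mathbf{k}\mu} = -g_{\mu\nu}\,\hbar/4\pi^2 \,.\, k_0\,\delta_3(\mathbf{k}-\mathbf{k}'). \tag{27}$$

This equation, together with the equations

$$\left.\begin{aligned}
A_{\mathbf{k}\mu}A_{\mathbf{k}'\nu} - A_{\mathbf{k}'\nu}A_{\mathbf{k}\mu} &= 0,\\
\bar A_{\mathbf{k}\mu}\bar A_{\mathbf{k}'\nu} - \bar A_{\mathbf{k}'\nu}\bar A_{\mathbf{k}\mu} &= 0,
\end{aligned}\right\} \tag{28}$$

provide the quantum conditions for the field quantities in the theory
with continuous $\mathbf{k}$-values. We have here the formalism which must
be used instead of (11) of § 60 for dealing with a set of oscillators
whose number is a continuous infinity, equal to the number of points
in a volume. The number of degrees of freedom of the system is a
continuous infinity, and the $\delta$ function appears in the commutation
relations instead of the two-suffix $\delta$ symbol.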

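The quantum conditions for the field may also be expressed in
terms of the potentials $A_\mu(x)$ at different points $x$ in space-time.
We have from (16), (27), and (28)

$$\begin{aligned}
[A_\mu(x), A_\nu(x')]
&= \iint \bigl[A_{\mathbf{k}\mu}e^{i(kx)} + \bar A_{\mathbf{k}\mu}e^{-i(kx)},\; A_{\mathbf{k}'\nu}e^{i(k'x')} + \bar A_{\mathbf{k}'\nu}e^{-i(k'x')}\bigr]\,k_0^{-1}k_0'^{-1}\,d^3k\,d^3k' \\
&= \frac{i g_{\mu\nu}}{4\pi^2}\iint \bigl\{e^{-i(kx)}e^{i(k'x')} - e^{i(kx)}e^{-i(k'x')}\bigr\}\,\delta_3(\mathbf{k}-\mathbf{k}')\,k_0^{-1}\,d^3k\,d^3k' \\
&= \frac{i g_{\mu\nu}}{4\pi^2}\int \bigl\{e^{-i(k,\,x-x')} - e^{i(k,\,x-x')}\bigr\}\,k_0^{-1}\,d^3k.
\end{aligned} \tag{29}$$

This three-dimensional $\mathbf{k}$-integral is easily seen to be equal to the
four-dimensional $k$-integral over the whole of four-dimensional
$k$-space

$$\int |\mathbf{k}|^{-1}\bigl\{\delta(k_0 - |\mathbf{k}|) - \delta(k_0 + |\mathbf{k}|)\bigr\}\,e^{-i(k,\,x-x')}\,d^4k,$$

so that the right-hand side of (29) becomes

$$\frac{i g_{\mu\nu}}{4\pi^2}\int \Delta(k)\,e^{-i(k,\,x-x')}\,d^4k.$$

Evaluating this integral with the help of formula (10), we get finally

$$[A_\mu(x), A_\nu(x')] = g_{\mu\nu}\,\Delta(x-x'). \tag{30}$$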

We see that potentials at two points in space-time always commute 
unless the line joining the two points is a null-line (i.e. the track of 
a light-ray). 
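
The last step leading to (30) can be checked explicitly. The sketch below assumes that formula (10), which is not reproduced in this section, is the Fourier resolution $\int \Delta(k)\,e^{i(k,x)}\,d^4k = 4\pi^2 i\,\Delta(x)$, and that $\Delta$ is an odd function of its argument; both are assumptions about the earlier text rather than statements taken from it.

```latex
\begin{align*}
\int \Delta(k)\,e^{-i(k,\,x-x')}\,d^4k
  &= 4\pi^2 i\,\Delta(x'-x) && \text{(assumed form of (10))}\\
  &= -4\pi^2 i\,\Delta(x-x') && \text{($\Delta$ odd)},\\[4pt]
\frac{i g_{\mu\nu}}{4\pi^2}\,\bigl(-4\pi^2 i\bigr)\,\Delta(x-x')
  &= g_{\mu\nu}\,\Delta(x-x').
\end{align*}
```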

Let us determine the quantum conditions for the quantities $A_\mu$ and
$\partial A_\mu/\partial x_0$ for various $x_1, x_2, x_3$ at a given time $x_0 = t$. Using the suffix
$t$ to denote a quantity taken at the time $x_0 = t$, we have, putting
$x_0 = x_0' = t$ in (30),

$$[A_{\mu t}(\mathbf{x}), A_{\nu t}(\mathbf{x}')] = 0. \tag{31}$$

Differentiating (30) with respect to $x_0$ and then putting $x_0 = x_0' = t$,
we get

$$\left[\left\{\frac{\partial A_\mu(x)}{\partial x_0}\right\}_t,\; A_{\nu t}(\mathbf{x}')\right] = g_{\mu\nu}\left\{\frac{\partial \Delta(x)}{\partial x_0}\right\}_{x_0=0} = 4\pi g_{\mu\nu}\,\delta_3(\mathbf{x}-\mathbf{x}') \tag{32}$$

from (12), the argument of $\Delta$ being $x - x'$. Finally, differentiating (30) with respect to $x_0$ and $x_0'$ and
then putting $x_0 = x_0' = t$, we get

$$\left[\left\{\frac{\partial A_\mu(x)}{\partial x_0}\right\}_t,\; \left\{\frac{\partial A_\nu(x')}{\partial x_0'}\right\}_t\right] = 0, \tag{33}$$

since $\partial^2\Delta(x)/\partial x_0^2 = 0$ for $x_0 = 0$. We can, as stated on p. 279, take
the quantities $A_{\mu t}(\mathbf{x})$ and $\{\partial A_\mu(x)/\partial x_0\}_t$ as the dynamical variables
describing the system, and equations (31), (32), and (33) are then the
quantum conditions for these dynamical variables. From the form
of these quantum conditions we see that, apart from numerical
coefficients, the $A_{\mu t}(\mathbf{x})$'s can be looked upon as a set of coordinates
$q$ and the $\{\partial A_\mu(x)/\partial x_0\}_t$'s as their conjugate momenta $p$, there being
a $\delta$ function on the right-hand side of (32) instead of a two-suffix $\delta$
symbol on account of the number of these $q$'s and $p$'s being a
continuous infinity. The quantum conditions (31), (32), (33) still hold
if the radiation is in interaction with matter, and indeed in all Lorentz
frames of reference, but the more general condition (30) need not then
hold, since the commutation relations connecting dynamical
variables at different times in general get altered by interaction.

The electric and magnetic fields $\mathcal{E}$ and $\mathcal{H}$ form in relativistic
notation a 6-vector $F_{\mu\nu} = -F_{\nu\mu}$,

$$\left.\begin{aligned}
\mathcal{E}_1 &= F_{10}, & \mathcal{E}_2 &= F_{20}, & \mathcal{E}_3 &= F_{30},\\
\mathcal{H}_1 &= F_{32}, & \mathcal{H}_2 &= F_{13}, & \mathcal{H}_3 &= F_{21}.
\end{aligned}\right\} \tag{34}$$

The equations connecting $\mathcal{E}$ and $\mathcal{H}$ with the potentials may be
written in tensor form

$$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}. \tag{35}$$

The quantum conditions connecting $\mathcal{E}$ and $\mathcal{H}$ at different points in
space-time can be obtained immediately from (35) and (30).


76. The Hamiltonian for the field 

The Hamiltonian for the field, $H_R$ say, must be chosen so as to
give the correct Heisenberg equations of motion for the dynamical
variables. This suffices to fix it, except for an arbitrary constant.
From (17), the dynamical variables $A_{\mathbf{k}\mu t}$ vary with $t$ or $x_0$ according
to the law

$$dA_{\mathbf{k}\mu t}/dt = i k_0 A_{\mathbf{k}\mu t}.$$

Thus from the Heisenberg equations of motion

$$i\hbar\, dA_{\mathbf{k}\mu t}/dt = A_{\mathbf{k}\mu t}H_R - H_R A_{\mathbf{k}\mu t}$$

we get

$$-\hbar k_0 A_{\mathbf{k}\mu t} = A_{\mathbf{k}\mu t}H_R - H_R A_{\mathbf{k}\mu t}. \tag{36}$$

We must choose $H_R$ to satisfy these conditions.

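Let us pass over to discrete $\mathbf{k}$-values and consider again one
particular $\mathbf{k}$-value for which $k_1 = k_2 = 0$, $k_3 = k_0 > 0$. We then
have the commutation relations (20), which show us that, so far as
concerns the degrees of freedom $A_{\mathbf{k}1t}$ and $A_{\mathbf{k}2t}$, $H_R$ must consist of
the terms

$$4\pi^2\,(A_{\mathbf{k}1t}\bar A_{\mathbf{k}1t} + A_{\mathbf{k}2t}\bar A_{\mathbf{k}2t})\,s_{\mathbf{k}}^{-1}, \tag{37}$$

as these terms substituted for $H_R$ in the right-hand side of (36) make
it equal the left-hand side. These terms are in agreement with (72)
of § 63, if one takes into account that the $A_{\mathbf{k}}$'s there differ from the
present ones by the factor $k_0$. For the degrees of freedom $A_{\mathbf{k}3t}$ and
$A_{\mathbf{k}0t}$ we have, from (22), the commutation relations

$$\left.\begin{aligned}
\bar A_{\mathbf{k}3t}A_{\mathbf{k}3t} - A_{\mathbf{k}3t}\bar A_{\mathbf{k}3t} &= \hbar k_0 s_{\mathbf{k}}/4\pi^2,\\
\bar A_{\mathbf{k}0t}A_{\mathbf{k}0t} - A_{\mathbf{k}0t}\bar A_{\mathbf{k}0t} &= -\hbar k_0 s_{\mathbf{k}}/4\pi^2,
\end{aligned}\right\} \tag{38}$$

which show similarly that $H_R$ contains the terms

$$4\pi^2\,(A_{\mathbf{k}3t}\bar A_{\mathbf{k}3t} - A_{\mathbf{k}0t}\bar A_{\mathbf{k}0t})\,s_{\mathbf{k}}^{-1}.$$

It is convenient to change this by a constant and to take instead the
terms

$$4\pi^2\,(A_{\mathbf{k}3t}\bar A_{\mathbf{k}3t} - \bar A_{\mathbf{k}0t}A_{\mathbf{k}0t})\,s_{\mathbf{k}}^{-1}, \tag{39}$$

as it will be found later that (39) gives no zero-point energy to $H_R$.
The total Hamiltonian is now

$$H_R = 4\pi^2 \sum_{\mathbf{k}} (A_{\mathbf{k}1}\bar A_{\mathbf{k}1} + A_{\mathbf{k}2}\bar A_{\mathbf{k}2} + A_{\mathbf{k}3}\bar A_{\mathbf{k}3} - \bar A_{\mathbf{k}0}A_{\mathbf{k}0})\,s_{\mathbf{k}}^{-1} \tag{40}$$

$$\phantom{H_R} = 4\pi^2 \int (A_{\mathbf{k}1}\bar A_{\mathbf{k}1} + A_{\mathbf{k}2}\bar A_{\mathbf{k}2} + A_{\mathbf{k}3}\bar A_{\mathbf{k}3} - \bar A_{\mathbf{k}0}A_{\mathbf{k}0})\,d^3k \tag{41}$$

if we pass back to continuous $\mathbf{k}$-values. This $H_R$ gives, according to
(17) of § 29,

$$e^{iH_R t/\hbar}A_{\mathbf{k}\mu}e^{-iH_R t/\hbar} = A_{\mathbf{k}\mu t} = A_{\mathbf{k}\mu}e^{ik_0 t}. \tag{42}$$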
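
The statement that the terms (37) satisfy (36), and that the two forms of the longitudinal terms differ only by a constant, can be checked directly. The following is a sketch for a single transverse and a single time-like degree of freedom; the shorthand $c = 4\pi^2/s_{\mathbf{k}}$ is introduced here only for brevity, and the suffixes $\mathbf{k}$, $t$ are dropped.

```latex
\begin{align*}
% transverse term of (37), using (20):  \bar A_1 A_1 - A_1 \bar A_1 = \hbar k_0 / c
A_1\,(c\,A_1\bar A_1) - (c\,A_1\bar A_1)\,A_1
  &= c\,A_1\,(A_1\bar A_1 - \bar A_1 A_1)
   = c\,A_1\,(-\hbar k_0/c) = -\hbar k_0\,A_1,\\[4pt]
% time-like term, using the second of (38):  \bar A_0 A_0 - A_0 \bar A_0 = -\hbar k_0 / c
-c\,A_0\bar A_0
  &= -c\,\bigl(\bar A_0 A_0 + \hbar k_0/c\bigr)
   = -c\,\bar A_0 A_0 - \hbar k_0 .
\end{align*}
```

The first line reproduces the left-hand side of (36); the second shows that replacing $A_0\bar A_0$ by $\bar A_0 A_0$ shifts the Hamiltonian by the constant $\hbar k_0$ per degree of freedom, which is the change made in passing to (39).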

We may call 'longitudinal degrees of freedom' the degrees of
freedom associated with the variables $A_{\mathbf{k}0t}$ and $A_{\mathbf{k}3t}$ for the particular
$\mathbf{k}$-value considered above, in contradistinction to the 'transverse
degrees of freedom' associated with the variables $A_{\mathbf{k}1t}$ and $A_{\mathbf{k}2t}$. For
a general $\mathbf{k}$-value $A_{\mathbf{k}3t}$ is to be replaced by $A_{\mathbf{k}\kappa t}$, $\kappa$ being a unit
three-dimensional vector in the direction of the three-dimensional part of $k$.
The longitudinal degrees of freedom do not occur in the theory of
§ 63, $A_{\mathbf{k}0}$ and $A_{\mathbf{k}\kappa}$ there being zero. The present Hamiltonian (40)
differs from the Hamiltonian (72) of § 63 by the terms referring to
the longitudinal degrees of freedom, these terms being needed now
to make $A_{\mathbf{k}0t}$ and $A_{\mathbf{k}\kappa t}$ vary correctly with $t$.

We see from (39) that the contribution of the degree of freedom
$A_{\mathbf{k}0t}$ to the Hamiltonian is negative. This means that the dynamical
system formed by the variables $A_{\mathbf{k}0t}$, $\bar A_{\mathbf{k}0t}$ is a harmonic oscillator of
negative energy. It is rather surprising that such an unphysical idea
as negative energy should appear in the theory in this way. The
negative energy is a necessary consequence of the $-$ sign on the
right-hand side of the second of equations (38) and this $-$ sign is
demanded by relativity. We shall see in the next section that the
negative energy associated with the degree of freedom $A_{\mathbf{k}0t}$ is always
compensated by the positive energy associated with the corresponding
longitudinal degree of freedom $A_{\mathbf{k}\kappa t}$, so that it never shows up in
practice.

The theory of a harmonic oscillator of negative energy may be
built up in the same way as that of an ordinary harmonic oscillator
given in § 34. Expressing the $A_{\mathbf{k}0t}$ of the second of equations (38) in
terms of $\bar\eta$ by means of

$$2\pi A_{\mathbf{k}0t} = (\hbar k_0 s_{\mathbf{k}})^{1/2}\,\bar\eta,$$

we have $\bar\eta$ satisfying the same commutation relation with $\eta$ as in
§ 34, and the energy in this degree of freedom is $-\hbar k_0\,\eta\bar\eta$, from (39).
The work of § 34 now shows that the maximum eigenvalue of the
energy is zero, the other eigenvalues being negative integral multiples
of $\hbar k_0$. Introducing the normalized eigenket of the energy belonging
to the eigenvalue zero as the standard ket $|0\rangle$, we have

$$\bar\eta\,|0\rangle = 0$$

as in § 34, and $\eta^n|0\rangle$ with $n$ a positive integer is the ket corresponding
to the $n$th quantum state, which has the energy $-n\hbar k_0$. Any ket can
be expressed as a power series in $\eta$ multiplied into $|0\rangle$.
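
The eigenvalue statement can be verified in a few lines. This sketch assumes only the commutation relation $\bar\eta\eta - \eta\bar\eta = 1$ of § 34, the condition $\bar\eta\,|0\rangle = 0$, and the energy operator $-\hbar k_0\,\eta\bar\eta$ quoted above.

```latex
\begin{align*}
\bar\eta\,\eta^{n} &= \eta^{n}\,\bar\eta + n\,\eta^{n-1}
  && \text{(by repeated use of } \bar\eta\eta = \eta\bar\eta + 1\text{)},\\
\eta\bar\eta\;\eta^{n}|0\rangle
  &= \eta\,\bigl(\eta^{n}\bar\eta + n\,\eta^{n-1}\bigr)|0\rangle
   = n\,\eta^{n}|0\rangle
  && \text{(since } \bar\eta\,|0\rangle = 0\text{)},\\
-\hbar k_0\,\eta\bar\eta\;\eta^{n}|0\rangle
  &= -n\hbar k_0\;\eta^{n}|0\rangle .
\end{align*}
```

So each application of $\eta$ lowers the energy by $\hbar k_0$, and the spectrum $0, -\hbar k_0, -2\hbar k_0, \ldots$ is bounded above rather than below.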

For the whole field of radiation we can introduce a standard ket $\rangle_F$
for which there is zero energy in each degree of freedom. Any state
of the field of radiation then corresponds to a ket of the form of a
power series in the various $\eta$-variables multiplied into $\rangle_F$. We can
replace the power series in the $\eta$-variables by a power series in the
Fourier coefficients $A_{\mathbf{k}1}$, $A_{\mathbf{k}2}$, $A_{\mathbf{k}3}$, $\bar A_{\mathbf{k}0}$. The different terms in the
power series correspond to different degrees of excitation of the various
Fourier components of the field. Alternatively, they correspond
to different numbers of photons present in the various stationary
states of a photon, there being now longitudinal photons associated
with the longitudinal degrees of freedom, as well as the usual
transverse ones. (The physical significance of the longitudinal photons
will become clear later, see p. 305.) If we are working with continuous
$\mathbf{k}$-values, the power series in $A_{\mathbf{k}1}$, $A_{\mathbf{k}2}$, $A_{\mathbf{k}3}$, $\bar A_{\mathbf{k}0}$ becomes a sum of
integrals of degree 0, 1, 2, 3, ... in these variables. Any of the linear
operators $\bar A_{\mathbf{k}1}$, $\bar A_{\mathbf{k}2}$, $\bar A_{\mathbf{k}3}$, $A_{\mathbf{k}0}$ applied to $\rangle_F$ gives zero.

77. The supplementary conditions 

We must now go back to the second of the Maxwell equations (13),
which we have ignored so far. We cannot take this equation over
directly into the quantum theory without getting inconsistencies.
The left-hand side of this equation does not commute with $A_\nu(x')$,
according to the quantum conditions (30), so this left-hand side
cannot vanish. The way out of the difficulty was shown by Fermi.†
It consists in adopting a less stringent equation, namely the equation

$$(\partial A_\mu/\partial x_\mu)\,|\rangle = 0, \tag{43}$$

and assuming it to hold for any $|\rangle$ corresponding to a state that can
actually occur in nature. There is one equation (43) for each point
in space-time and these equations must all hold for any ket
corresponding to a state that can actually occur. The ket in (43) does not
depend on $t$, since we are using the Heisenberg picture, in which each
state corresponds to a fixed ket.

† Fermi, Reviews of Modern Physics, 4 (1932), 125.

We shall call a condition such as (43), which a ket has to satisfy to
correspond to an actual state, a supplementary condition. The
existence of supplementary conditions in the theory does not mean any
departure from or modification in the general principles of quantum
mechanics. The principle of superposition of states and the whole of
the general theory of states, dynamical variables, and observables,
as given in Chapter II, apply also when there are supplementary
conditions, provided we impose a further requirement on a linear
operator in order that it may represent an observable, namely the
requirement that, when it operates on any ket satisfying the
supplementary conditions, it changes this ket into another ket satisfying
the supplementary conditions. We have already had an example of
supplementary conditions in the theory of systems containing several
similar particles. The condition that only symmetrical wave
functions, or only antisymmetrical wave functions, represent states that
can actually occur in nature, is precisely of the same type as condition
(43) and is what we are now calling a supplementary condition. In
this theory the further requirement on linear operators in order that
they shall represent observables is that they shall be symmetrical
between the similar particles.

When we introduce supplementary conditions into our theory we
must verify that they are not too restrictive to allow any ket at all
to satisfy them. If we have more than one supplementary condition,
we can deduce further supplementary conditions from them by taking
P.B.s of the operators in them; thus if we have

$$U\,|\rangle = 0, \qquad V\,|\rangle = 0, \tag{44}$$

we can deduce

$$[U, V]\,|\rangle = 0, \qquad [U, [U, V]]\,|\rangle = 0, \tag{45}$$

and so on. To verify that our supplementary conditions are not too
restrictive, we have to look into all the further supplementary
conditions obtainable by this procedure to see that they can be satisfied,
which we can usually do by showing that after a certain point the
further supplementary conditions are all either identically satisfied
or repetitions of the previous ones.

To apply this procedure to the supplementary conditions (43), we
work out the P.B. of two of the linear operators $\partial A_\mu/\partial x_\mu$, say those
at the points $x$ and $x'$ in space-time. We have from (30)

$$\left[\frac{\partial A_\mu(x)}{\partial x_\mu},\; \frac{\partial A_\nu(x')}{\partial x'_\nu}\right] = g_{\mu\nu}\frac{\partial^2 \Delta(x-x')}{\partial x_\mu\,\partial x'_\nu} = -g_{\mu\nu}\frac{\partial^2 \Delta(x-x')}{\partial x_\mu\,\partial x_\nu} = -\Box\,\Delta(x-x') = 0$$

from (11). Thus the conditions (45) are all identically satisfied, so our
supplementary conditions are not too restrictive.

We should verify also that the supplementary conditions are
consistent with the equations of motion, in the present case with the first
of equations (13). This consistency is immediately evident in the
quantum theory, as in the classical theory.

Since the second of equations (13) is not valid and has to be replaced
by a supplementary condition, any consequences of this equation in
the ordinary Maxwell theory will not be valid in the quantum theory
and will have to be replaced by supplementary conditions. The
equations

$$\operatorname{div}\mathcal{H} = 0, \qquad \partial\mathcal{H}/\partial t = -\operatorname{curl}\mathcal{E} \tag{46}$$

follow simply from the equations defining $\mathcal{E}$ and $\mathcal{H}$ in terms of the
potentials, namely (35), and are therefore valid also in the quantum
theory. The other Maxwell equations for empty space, however,
namely

$$\operatorname{div}\mathcal{E} = 0, \qquad \partial\mathcal{E}/\partial t = \operatorname{curl}\mathcal{H},$$

or

$$\partial F_{\mu\nu}/\partial x_\mu = 0,$$

can be derived only with the help of the second of equations (13), as
one sees at once if one substitutes for $F_{\mu\nu}$ its value given by (35), and
are thus not valid in the quantum theory. They must be replaced by

$$\{\operatorname{div}\mathcal{E}\}\,|\rangle = 0, \qquad \{\partial\mathcal{E}/\partial t - \operatorname{curl}\mathcal{H}\}\,|\rangle = 0, \tag{47}$$

holding for any $|\rangle$ corresponding to a state that can actually occur.

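The field quantities $\mathcal{E}$ and $\mathcal{H}$ at any point in space-time commute
with all the operators in the supplementary conditions, since from
(35) and (30)

$$\begin{aligned}
\left[F_{\mu\nu}(x),\; \frac{\partial A_\lambda(x')}{\partial x'_\lambda}\right]
&= \left[\frac{\partial A_\nu(x)}{\partial x^\mu} - \frac{\partial A_\mu(x)}{\partial x^\nu},\; \frac{\partial A_\lambda(x')}{\partial x'_\lambda}\right] \\
&= g_{\nu\lambda}\frac{\partial^2 \Delta(x-x')}{\partial x^\mu\,\partial x'_\lambda} - g_{\mu\lambda}\frac{\partial^2 \Delta(x-x')}{\partial x^\nu\,\partial x'_\lambda}
 = \frac{\partial^2 \Delta(x-x')}{\partial x^\mu\,\partial x'^\nu} - \frac{\partial^2 \Delta(x-x')}{\partial x^\nu\,\partial x'^\mu} = 0.
\end{aligned}$$

It follows that if $\mathcal{E}$ or $\mathcal{H}$ is multiplied into a ket satisfying the
supplementary conditions, it will give another ket satisfying the
supplementary conditions, and hence it fulfils the new requirement
for being an observable. The potentials do not satisfy this
requirement.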

By making a Fourier resolution of the left-hand side of equation
(43) we get the equations

$$k^\mu A_{\mathbf{k}\mu}\,|\rangle = 0, \qquad k^\mu \bar A_{\mathbf{k}\mu}\,|\rangle = 0 \tag{48}$$

holding for all values of the 4-vector $k$ with $k_\mu k^\mu = 0$ and $k_0 > 0$.
This is another form for the supplementary conditions. The P.B. of
the operators $k^\mu A_{\mathbf{k}\mu}$ and $k^\mu \bar A_{\mathbf{k}\mu}$ here, of course, vanishes, as may be
verified directly from (23) or (27).

To examine the consequences of equations (48), let us work with
discrete $\mathbf{k}$-values and consider first one particular $\mathbf{k}$-value for which
$k_1 = k_2 = 0$, $k_3 = k_0 > 0$, as we have done on previous occasions.
For this $\mathbf{k}$-value equations (48) become

$$(A_{\mathbf{k}0} - A_{\mathbf{k}3})\,|\rangle = 0, \qquad (\bar A_{\mathbf{k}0} - \bar A_{\mathbf{k}3})\,|\rangle = 0. \tag{49}$$

Multiplying the first of these on the left by $(\bar A_{\mathbf{k}0} + \bar A_{\mathbf{k}3})$ and the second
by $(A_{\mathbf{k}0} + A_{\mathbf{k}3})$ and adding, we get

$$(\bar A_{\mathbf{k}0}A_{\mathbf{k}0} + A_{\mathbf{k}0}\bar A_{\mathbf{k}0} - \bar A_{\mathbf{k}3}A_{\mathbf{k}3} - A_{\mathbf{k}3}\bar A_{\mathbf{k}3})\,|\rangle = 0$$

or

$$2(\bar A_{\mathbf{k}0}A_{\mathbf{k}0} - A_{\mathbf{k}3}\bar A_{\mathbf{k}3})\,|\rangle = 0$$

with the help of (22). This shows that the energy in the two
longitudinal degrees of freedom for this $\mathbf{k}$-value, namely expression (39),
vanishes for any state that occurs in nature. The same result holds
for all $\mathbf{k}$-values. Thus the supplementary conditions ensure that the
negative energy in any $A_{\mathbf{k}0t}$ degree of freedom is always exactly cancelled
by the positive energy in the corresponding $A_{\mathbf{k}\kappa t}$ degree of freedom.
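
The algebra behind this cancellation can be written out as follows. This is a sketch for the single $\mathbf{k}$-value considered, with the suffix $\mathbf{k}$ dropped and the shorthand $c' = \hbar k_0 s_{\mathbf{k}}/4\pi^2$ introduced here only for brevity; it uses the fact that variables with different suffixes 0 and 3 commute, together with (22) in the form $\bar A_3 A_3 - A_3\bar A_3 = c'$, $\bar A_0 A_0 - A_0\bar A_0 = -c'$.

```latex
\begin{align*}
(\bar A_0 + \bar A_3)(A_0 - A_3) + (A_0 + A_3)(\bar A_0 - \bar A_3)
 &= \bar A_0 A_0 + A_0\bar A_0 - \bar A_3 A_3 - A_3\bar A_3 \\
 &\quad + \underbrace{(\bar A_3 A_0 - A_0\bar A_3)}_{=\,0}
        + \underbrace{(A_3\bar A_0 - \bar A_0 A_3)}_{=\,0},\\[4pt]
\bar A_0 A_0 + A_0\bar A_0 - \bar A_3 A_3 - A_3\bar A_3
 &= \bar A_0 A_0 + (\bar A_0 A_0 + c') - (A_3\bar A_3 + c') - A_3\bar A_3 \\
 &= 2\,(\bar A_0 A_0 - A_3\bar A_3).
\end{align*}
```

Up to the factor $-8\pi^2 s_{\mathbf{k}}^{-1}$ the last bracket is expression (39), so (39) applied to any ket satisfying (49) gives zero.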

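Let us express the $|\rangle$ in (48) in the form

$$|\rangle = \psi\,\rangle_F,$$

where $\rangle_F$ is the standard ket for the field of radiation introduced in
the preceding section, corresponding to zero energy in each degree of
freedom, and $\psi$ is a power series in the operators $A_{\mathbf{k}1}$, $A_{\mathbf{k}2}$, $A_{\mathbf{k}3}$, $\bar A_{\mathbf{k}0}$.
Since $A_{\mathbf{k}0}\rangle_F = \bar A_{\mathbf{k}3}\rangle_F = 0$, we get from (49), for the $\mathbf{k}$-value to which
these equations refer,

$$(A_{\mathbf{k}0}\psi - \psi A_{\mathbf{k}0})\rangle_F = A_{\mathbf{k}3}\psi\rangle_F, \qquad (\bar A_{\mathbf{k}3}\psi - \psi\bar A_{\mathbf{k}3})\rangle_F = \bar A_{\mathbf{k}0}\psi\rangle_F.$$

With the help of the commutation relations (22), these equations
reduce to

$$\frac{\hbar k_0 s_{\mathbf{k}}}{4\pi^2}\,\frac{\partial\psi}{\partial \bar A_{\mathbf{k}0}}\rangle_F = A_{\mathbf{k}3}\psi\rangle_F, \qquad \frac{\hbar k_0 s_{\mathbf{k}}}{4\pi^2}\,\frac{\partial\psi}{\partial A_{\mathbf{k}3}}\rangle_F = \bar A_{\mathbf{k}0}\psi\rangle_F,$$

showing that $\psi$ is of the form

$$\psi = e^{4\pi^2 \bar A_{\mathbf{k}0}A_{\mathbf{k}3}/\hbar k_0 s_{\mathbf{k}}}\,\psi_1,$$

where $\psi_1$ is independent of $\bar A_{\mathbf{k}0}$ and $A_{\mathbf{k}3}$. Applying this argument to
all $\mathbf{k}$-values, we find that $\psi$ is of the form

$$\psi = e^{4\pi^2 \sum_{\mathbf{k}} \bar A_{\mathbf{k}0}A_{\mathbf{k}\kappa}/\hbar k_0 s_{\mathbf{k}}}\,\chi, \tag{50}$$

where $\chi$ involves only the transverse components of $A_{\mathbf{k}}$. In terms of
continuous $\mathbf{k}$-values

$$\psi = e^{4\pi^2 \int \bar A_{\mathbf{k}0}A_{\mathbf{k}\kappa}/\hbar k_0 \,.\, d^3k}\,\chi. \tag{51}$$

We see in this way that the supplementary conditions fix the form
of the wave function $\psi$ so far as concerns the longitudinal degrees of
freedom. Thus the longitudinal degrees of freedom cannot play any
important role in the dynamical theory. This corresponds to their
not being of physical importance. Their only purpose is to give the
theory a relativistic setting. The important part of $\psi$ is the factor $\chi$
referring to the transverse degrees of freedom. This factor is the
same as the wave function in the theory of a field of radiation without
interaction with matter given on pp. 240-2.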
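
That an exponential factor of this kind solves the two reduced equations can be checked by direct differentiation. The sketch below treats a single $\mathbf{k}$-value, drops the suffix $\mathbf{k}$, writes $c' = \hbar k_0 s_{\mathbf{k}}/4\pi^2$ as a shorthand introduced only here, and assumes the reduced equations have the form $c'\,\partial\psi/\partial\bar A_0 = A_3\psi$ and $c'\,\partial\psi/\partial A_3 = \bar A_0\psi$, which is how they follow from (22) when $\psi$ is a power series in $A_3$ and $\bar A_0$ (these two variables commute with each other).

```latex
\begin{align*}
\psi &= e^{\bar A_0 A_3/c'}\,\psi_1, \qquad \psi_1 \text{ independent of } \bar A_0,\ A_3,\\
c'\,\frac{\partial\psi}{\partial \bar A_0} &= c'\,\frac{A_3}{c'}\,e^{\bar A_0 A_3/c'}\,\psi_1 = A_3\,\psi,\\
c'\,\frac{\partial\psi}{\partial A_3} &= c'\,\frac{\bar A_0}{c'}\,e^{\bar A_0 A_3/c'}\,\psi_1 = \bar A_0\,\psi .
\end{align*}
```

Conversely, $e^{-\bar A_0 A_3/c'}\psi$ has vanishing derivatives with respect to both $\bar A_0$ and $A_3$ whenever $\psi$ satisfies the two equations, so nothing beyond the factor $\psi_1$ is left undetermined.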


78. Classical electrodynamics in Hamiltonian form 

The foregoing theory must now be extended to take into account
the interaction of the field of radiation with matter. This involves
dealing with the dynamical system composed of a number of charged
particles interacting with the electromagnetic field. Let us first
consider this dynamical system classically and see how to put its
equations of motion into Hamiltonian form. We shall then have a basis
from which to build up a quantum theory by analogy.

Each of the charged particles will describe a world-line in
space-time in the classical theory. We give the particles labels $i$ and
denote the coordinates of a point on the world-line of the $i$th particle
by $z_{\mu i}$. These coordinates are functions of the proper-time $s_i$ of the
$i$th particle, this proper-time being defined so that its difference for
two neighbouring points on the world-line satisfies

$$ds_i^2 = (dz_i, dz_i), \qquad dz_{0i}/ds_i > 0. \tag{52}$$

The velocity 4-vector of the $i$th particle is defined by

$$v_i = dz_i/ds_i \tag{53}$$

and satisfies from (52)

$$(v_i, v_i) = 1, \qquad v_{0i} > 0. \tag{54}$$

The presence of charges changes the Maxwell equations (13) to

$$\Box\,\mathcal{A}_\mu = 4\pi j_\mu, \qquad \partial\mathcal{A}_\mu/\partial x_\mu = 0, \tag{55}$$

where $j_\mu$ is the 4-vector whose time component is the charge-density
and whose space components are the current density. For
mathematical simplicity we suppose the charge on each particle to be
concentrated at one point. Then $j_\mu$ vanishes everywhere except on the
world-lines of the particles, where it has singularities which can be
described in terms of $\delta$ functions. The solution of (55) can be written
in the form

$$\mathcal{A}_\mu = A_{\mu,\mathrm{in}} + \sum_i A_{\mu i,\mathrm{ret}}, \tag{56}$$

where $A_{\mu,\mathrm{in}}$ are the potentials of the incoming field of radiation which
acts on the particles and $A_{\mu i,\mathrm{ret}}$ are the retarded potentials of the $i$th
particle, the summation in (56) being over all the particles. The
potentials $A_{\mu,\mathrm{in}}$ satisfy the equations for no charges, equations (13),
and the $A_{\mu i,\mathrm{ret}}$ are given by

$$A_{\mu i,\mathrm{ret}}(x) = e_i v_{\mu i}/(v_i, x - z_i), \tag{57}$$

$e_i$ being the charge of the $i$th particle, and the variables $v_i$, $z_i$ in (57)
being taken at the retarded proper-time of the $i$th particle, for which

$$(x - z_i, x - z_i) = 0, \qquad x_0 - z_{0i} > 0. \tag{58}$$

As the equations of motion for the $i$th particle, we shall take
Lorentz's equations

$$m_i\,\frac{dv_{\mu i}}{ds_i} = e_i v_i^{\nu}\Bigl\{F_{\mu\nu,\mathrm{in}} + \sum_{j\neq i} F_{\mu\nu j,\mathrm{ret}} + \tfrac12\bigl(F_{\mu\nu i,\mathrm{ret}} - F_{\mu\nu i,\mathrm{adv}}\bigr)\Bigr\}, \tag{59}$$

$m_i$ being the mass of the $i$th particle, $F_{\mu\nu,\mathrm{in}}$ and $F_{\mu\nu j,\mathrm{ret}}$ being the fields
derived from the potentials $A_{\mu,\mathrm{in}}$ and $A_{\mu j,\mathrm{ret}}$ in accordance with (35),
and $F_{\mu\nu i,\mathrm{adv}}$ being similarly the field derived from the advanced
potentials $A_{\mu i,\mathrm{adv}}$ given by (57) and (58) with the inequality in (58)
reversed. The field functions on the right-hand side of (59) are all
to be taken at the point $x = z_i$ where the $i$th particle is situated.
The summation in (59) is over all the particles except the $i$th and
shows that all the other particles act on the $i$th through their retarded
fields. The fields $F_{\mu\nu i,\mathrm{ret}}$ and $F_{\mu\nu i,\mathrm{adv}}$ are infinitely great at the point
$x = z_i$, but their difference is finite, and this difference occurring in
(59) gives the effect of radiation damping on the motion of the
particle.†

† For a derivation of Lorentz's equations in the form (59) and a discussion of their
validity and consequences, see Dirac, Proc. Roy. Soc. A, 167 (1938), 148.

Our problem now is to put the equations of motion (59) into the
Hamiltonian form. Let us first discuss in general terms what we
should expect the Hamiltonian form to be like in a relativistic theory.
We should not keep precisely to the form (14) or (15) of § 28, since
this puts the time on a different footing from the space coordinates.
We should expect to have the proper-time appearing as independent
variable, and since each particle has its own proper-time we must
then have several independent variables. Each dynamical variable $\xi$
is thus in general a function of the proper-times $s_i$ of all the particles
and has a value only with respect to a particular point on the
world-line of each particle. The general concept of a P.B. satisfying the
laws (2)-(6) of § 21 can be retained in a relativistic theory. We shall
need one Hamiltonian for each particle, the relativistic Hamiltonian
$G_i$ of the $i$th particle determining how dynamical variables vary with
the independent variable $s_i$, according to the equation

$$d\xi/ds_i = [\xi, G_i]. \tag{60}$$

In order that the various equations (60) for different $i$ shall be
consistent they must make

$$d^2\xi/ds_i\,ds_j = d^2\xi/ds_j\,ds_i,$$

which requires that

$$[[\xi, G_i], G_j] = [[\xi, G_j], G_i]$$

or

$$[[G_i, G_j], \xi] = 0, \tag{61}$$

from (6) of § 21. This must hold for any dynamical variable $\xi$ so we
must have

$$[G_i, G_j] = \text{a number}. \tag{62}$$

Equations (60) and (62) give the general Hamiltonian form of the
equations of motion in a relativistic theory of several particles.
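
The step from the consistency requirement to (61) is a one-line application of the Jacobi identity. The sketch below assumes that law (6) of § 21 is the Jacobi identity for P.B.s and uses the antisymmetry of the bracket.

```latex
\begin{align*}
&[[\xi, G_i], G_j] + [[G_i, G_j], \xi] + [[G_j, \xi], G_i] = 0
  && \text{(Jacobi identity)},\\
&[[G_j, \xi], G_i] = -[[\xi, G_j], G_i]
  && \text{(antisymmetry)},\\
&\Longrightarrow\quad
  [[\xi, G_i], G_j] - [[\xi, G_j], G_i] = -[[G_i, G_j], \xi].
\end{align*}
```

The two iterated brackets therefore agree for every $\xi$ exactly when $[[G_i, G_j], \xi] = 0$ for every $\xi$, which is (61); a quantity whose P.B. with every dynamical variable vanishes is a number, giving (62).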

Let us consider the dynamical variables for our system of several
charged particles interacting with the electromagnetic field. The four
coordinates $z_{\mu i}$ of the $i$th particle will provide four dynamical
variables, the time coordinate being treated on the same footing as the
three space coordinates. The four components $p_{\mu i}$ of the
momentum-energy 4-vector of the $i$th particle will provide four more. As the obvious
generalization of the P.B. relations between coordinates and momenta
in non-relativistic dynamics, we assume

$$[z_{\mu i}, z_{\nu j}] = 0, \qquad [p_{\mu i}, p_{\nu j}] = 0, \qquad [p_{\mu i}, z_{\nu j}] = g_{\mu\nu}\,\delta_{ij}. \tag{63}$$

The variables $z_{\mu i}$, $p_{\mu i}$ should depend only on the proper-time $s_i$ and
should be independent of the proper-times $s_j$ ($j \neq i$) of the other
particles, so from (60) we must have

$$[z_{\mu i}, G_j] = 0, \qquad [p_{\mu i}, G_j] = 0 \qquad (j \neq i). \tag{64}$$

We need also dynamical variables to describe the field. We take
these to be the potentials $A_\mu(x)$ at all points in space-time. The
4-vector $x$ here should be looked upon as a parameter labelling these
dynamical variables, there being four of them for each $x$. Each of
these dynamical variables $A_\mu(x)$ is a function of the proper-times $s_i$.
Thus all the $A_\mu(x)$ variables together provide a set of potentials
throughout space-time depending on a point on the world-line of each
particle. These potentials are therefore not the same as the Maxwell
potentials $\mathcal{A}_\mu(x)$ satisfying (55). We shall call them the Wentzel
potentials.† They are closely related to the Maxwell potentials, as
will appear later.

Since a particle variable and a field variable refer to different
degrees of freedom, their P.B. must be zero, i.e.

$$[z_{\mu i}, A_\nu(x)] = 0, \qquad [p_{\mu i}, A_\nu(x)] = 0. \tag{65}$$

We need also the P.B. of two field variables. A value for this P.B.
is provided by the theory of radiation without interaction with
matter, namely by equation (30) considered classically. This equation
as it stands, however, is not a satisfactory one to use when there are
charged particles present, as it causes certain infinite terms to appear
in the equations of motion of the particles. One must replace it by

$$[A_\mu(x), A_\nu(x')] = \tfrac12 g_{\mu\nu}\{\Delta(x - x' + \lambda) + \Delta(x - x' - \lambda)\}, \tag{66}$$

where $\lambda$ is a small 4-vector lying within the light-cone, i.e.

$$(\lambda, \lambda) > 0, \tag{67}$$

and is ultimately to be made to tend to zero. One must not make
$\lambda \to 0$ too early or one will get infinite terms appearing in the
equations. With finite $\lambda$ the theory is not relativistic, as the direction of $\lambda$
provides a preferred direction in space-time, but it will be found that
as $\lambda \to 0$ the equations of motion become independent of the direction
of $\lambda$, so long as (67) is satisfied, so that in the limit the theory is
relativistic. Equations (63), (65), and (66) give the P.B.s of all our
dynamical variables.

† These potentials were first used to give Lorentz's equations of motion by Wentzel,
Z. f. Physik, 86 (1933), 479.

We must now set up the Hamiltonians. We shall assume that

$$G_i = \frac{1}{2m_i}\bigl\{m_i^2 - (p_i - e_i A(z_i),\; p_i - e_i A(z_i))\bigr\} \tag{68}$$

and shall verify that these Hamiltonians lead to the correct equations of
motion. Let us first test for consistency. We find from (63), (65), and
(66) that

$$[G_i, G_j] = 0 \tag{69}$$

provided the conditions

$$(z_i - z_j \pm \lambda,\; z_i - z_j \pm \lambda) < 0 \qquad (i \neq j) \tag{70}$$

are fulfilled. These conditions mean that the independent variables
$s_i$ are not completely independent, but must be restricted so that the
points which they specify on the world-lines of the various particles
each lie outside the light-cones with the others as vertices (and remain
so when shifted by the amount $\pm\lambda$). Subject to these conditions the
equations of motion are consistent. The dynamical variables should
now be considered as undefined for values of the $s_i$ which do not
fulfil (70).

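Let us consider now the equations of motion. We see at once that
equations (64) are satisfied. Putting $\xi = z_{\mu i}$ in (60), we get

$$v_{\mu i} = \frac{dz_{\mu i}}{ds_i} = -\frac{\partial G_i}{\partial p_i^{\mu}} = m_i^{-1}\{p_{\mu i} - e_i A_\mu(z_i)\}, \tag{71}$$

which is the usual relation between velocity and momentum for a
charged particle. From (54) and (68) we see now that

$$G_i = 0. \tag{72}$$

Equations (69) show that the $G_i$ are all constants of the motion and
(72) shows that we must take these constants to be zero to get the
equations of motion that we want. Putting $\xi = p_{\mu i}$ in (60), we get

$$\frac{dp_{\mu i}}{ds_i} = \frac{\partial G_i}{\partial z_i^{\mu}} = e_i v_i^{\nu}\left[\frac{\partial A_\nu(x)}{\partial x^\mu}\right]_{x = z_i},$$

which reduces, with the help of (71), to

$$m_i\,\frac{dv_{\mu i}}{ds_i} = e_i v_i^{\nu}\left[\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right]_{x = z_i}. \tag{73}$$

This would be the same as Lorentz's equation (59) if we could arrange
to have

$$A_\mu(x) = A_{\mu,\mathrm{in}}(x) + \sum_{j\neq i} A_{\mu j,\mathrm{ret}}(x) + \tfrac12 A_{\mu i,\mathrm{ret}}(x) - \tfrac12 A_{\mu i,\mathrm{adv}}(x) \tag{74}$$

for $x$ in the neighbourhood of $z_i$. Finally, putting $\xi = A_\mu(x)$ in (60),

$$\begin{aligned}
\frac{dA_\mu(x)}{ds_i} &= \frac{e_i}{m_i}\{p_i^{\nu} - e_i A^{\nu}(z_i)\}\,[A_\mu(x), A_\nu(z_i)] \\
&= \tfrac12 e_i v_{\mu i}\{\Delta(x - z_i + \lambda) + \Delta(x - z_i - \lambda)\}.
\end{aligned} \tag{75}$$

These equations for all $i$ can be integrated to give

$$A_\mu(x) = \tfrac12 \sum_i e_i \int_{-\infty}^{s_i} v'_{\mu i}\{\Delta(x - z'_i + \lambda) + \Delta(x - z'_i - \lambda)\}\,ds'_i + a_\mu(x), \tag{76}$$

where $v'_i$, $z'_i$ are short for $v_i(s'_i)$, $z_i(s'_i)$, and $a_\mu(x)$ is a constant of the
motion for each $x$ and $\mu$, i.e. it is independent of the $s_i$. Equation (76)
shows the form of the Wentzel potentials $A_\mu(x)$ as functions of the $s_i$.
These equations, it should be remembered, hold only for values of the
$s_i$ satisfying (70); for other values of the $s_i$ the Wentzel potentials are
undefined.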
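
The passage from (60) to (75) can be spelled out as a short classical calculation. This is a sketch under stated assumptions: the shorthand $\pi_{\nu i} = p_{\nu i} - e_i A_\nu(z_i)$ is introduced only here, the P.B. is assumed to obey the product (Leibniz) law, and the relations (65), (66), (68), and (71) are used in the forms $[p_{\nu i}, A_\mu(x)] = 0$, $[A_\mu(x), A_\nu(x')] = \tfrac12 g_{\mu\nu}\{\Delta(x-x'+\lambda) + \Delta(x-x'-\lambda)\}$, $G_i = (2m_i)^{-1}\{m_i^2 - (\pi_i, \pi_i)\}$, and $m_i v_{\mu i} = \pi_{\mu i}$.

```latex
\begin{align*}
\frac{dA_\mu(x)}{ds_i} = [A_\mu(x), G_i]
 &= -\frac{1}{2m_i}\,[A_\mu(x),\ \pi_i^{\nu}\pi_{\nu i}]
  = -\frac{1}{m_i}\,\pi_i^{\nu}\,[A_\mu(x),\ \pi_{\nu i}] \\
 &= \frac{e_i}{m_i}\,\pi_i^{\nu}\,[A_\mu(x),\ A_\nu(z_i)]
  && \text{(by (65))}\\
 &= e_i\,v_i^{\nu}\cdot\tfrac12\,g_{\mu\nu}\{\Delta(x - z_i + \lambda) + \Delta(x - z_i - \lambda)\}
  && \text{(by (66), (71))}\\
 &= \tfrac12\,e_i\,v_{\mu i}\{\Delta(x - z_i + \lambda) + \Delta(x - z_i - \lambda)\}.
\end{align*}
```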

In order to see the significance of (76), let us study the integral

$$\int_{-\infty}^{s_i} v'_{\mu i}\,\Delta(x - z'_i)\,ds'_i. \tag{77}$$

If the point $x$ lies inside the future part of the light-cone of $z_i$ at the
proper-time $s_i$, i.e. if

$$(x - z_i, x - z_i) > 0, \qquad x_0 - z_{0i} > 0, \tag{78}$$

then (77) vanishes, since the $\Delta$ function vanishes throughout the
domain of integration. If the point $x$ lies outside the light-cone of
$z_i$, i.e. if

$$(x - z_i, x - z_i) < 0, \tag{79}$$

there is just one value of $s'_i$ in the domain of integration for which
the $\Delta$ function does not vanish, namely the retarded proper-time for
the field point $x$. The integral (77) is then equal to, with the help
of (6),

$$\int_{-\infty}^{s_i} v'_{\mu i}\,\Delta(x - z'_i)\,ds'_i = 2\int_{-\infty}^{s_i} v'_{\mu i}\,\delta(x - z'_i, x - z'_i)\,ds'_i = 2\int_{\infty}^{-\rho} v'_{\mu i}\,\frac{\delta(x - z'_i, x - z'_i)\;d(x - z'_i, x - z'_i)}{d(x - z'_i, x - z'_i)/ds'_i},$$

where $\rho$ is a positive number. The integral now becomes

$$\int_{-\rho}^{\infty} \frac{v'_{\mu i}}{(v'_i, x - z'_i)}\,\delta(x - z'_i, x - z'_i)\;d(x - z'_i, x - z'_i) = \frac{v_{\mu i}}{(v_i, x - z_i)}$$

taken at the retarded proper-time. Thus from (57)

$$e_i \int_{-\infty}^{s_i} v'_{\mu i}\,\Delta(x - z'_i)\,ds'_i = A_{\mu i,\mathrm{ret}}(x). \tag{80}$$

If the point $x$ lies inside the past part of the light-cone of $z_i$, i.e. if

$$(x - z_i, x - z_i) > 0, \qquad x_0 - z_{0i} < 0, \tag{81}$$

there are two values of $s'_i$ for which the $\Delta$ function does not vanish,
namely the retarded and advanced proper-times. The contribution
of the retarded proper-time to the integral (77) is the same as in the
preceding case; the contribution of the advanced proper-time may
be worked out by the same method and is, when multiplied by $e_i$,
$-A_{\mu i,\mathrm{adv}}(x)$. Summing up our results, we have

$$e_i \int_{-\infty}^{s_i} v'_{\mu i}\,\Delta(x - z'_i)\,ds'_i = \left\{\begin{aligned}
&0 && \text{when (78) holds,}\\
&A_{\mu i,\mathrm{ret}}(x) && \text{when (79) holds,}\\
&A_{\mu i,\mathrm{ret}}(x) - A_{\mu i,\mathrm{adv}}(x) && \text{when (81) holds.}
\end{aligned}\right. \tag{82}$$

Substituting the results (82) with $x \pm \lambda$ for $x$ into (76) we find, for
$x$ very close to $z_i$ (close compared to $\lambda$), taking into account (70) and
(67) and taking $\lambda_0 > 0$,

$$A_\mu(x) = \tfrac12 \sum_{j\neq i}\{A_{\mu j,\mathrm{ret}}(x + \lambda) + A_{\mu j,\mathrm{ret}}(x - \lambda)\} + \tfrac12 A_{\mu i,\mathrm{ret}}(x - \lambda) - \tfrac12 A_{\mu i,\mathrm{adv}}(x - \lambda) + a_\mu(x).$$

If we take

$$a_\mu(x) = A_{\mu,\mathrm{in}}(x), \tag{83}$$

this agrees with (74) in the limit $\lambda = 0$. Thus the choice (83) for the
constants of the motion $a_\mu(x)$ (a choice which is permissible since
neither side of the equation depends on the $s_i$) results in the equations
of motion for all the particles becoming the Lorentz equations in the limit
$\lambda = 0$.

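The ingoing potentials $A_{\mu,\mathrm{in}}$ must satisfy the equations (13) but are
otherwise undetermined. Thus the constants of the motion $a_\mu(x)$
must satisfy

$$\Box\,a_\mu(x) = 0, \qquad \partial a_\mu(x)/\partial x_\mu = 0 \tag{84}$$

but are otherwise arbitrary. Inserting these conditions in (76) we
find, with the help of (11),

$$\Box\,A_\mu(x) = 0, \tag{85}$$

$$\begin{aligned}
\frac{\partial A_\mu(x)}{\partial x_\mu}
&= \tfrac12 \sum_i e_i \int_{-\infty}^{s_i} v'_{\mu i}\,\frac{\partial}{\partial x_\mu}\{\Delta(x - z'_i + \lambda) + \Delta(x - z'_i - \lambda)\}\,ds'_i \\
&= -\tfrac12 \sum_i e_i \int_{-\infty}^{s_i} v'_{\mu i}\,\frac{\partial}{\partial z'_{\mu i}}\{\Delta(x - z'_i + \lambda) + \Delta(x - z'_i - \lambda)\}\,ds'_i \\
&= -\tfrac12 \sum_i e_i \int_{-\infty}^{s_i} \frac{d}{ds'_i}\{\Delta(x - z'_i + \lambda) + \Delta(x - z'_i - \lambda)\}\,ds'_i \\
&= -\tfrac12 \sum_i e_i \{\Delta(x - z_i + \lambda) + \Delta(x - z_i - \lambda)\}.
\end{aligned} \tag{86}$$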

The work of this section can be summed up as follows. To describe
a number of charged particles interacting with the electromagnetic
field we need the dynamical variables $z_{\mu i}$, $p_{\mu i}$, $A_\mu(x)$ satisfying the
P.B. relations (63), (65), (66). The equations of motion then take the
Hamiltonian form (60) with the Hamiltonians $G_i$ given by (68),
provided one imposes certain conditions on some of the constants of the
motion, namely the $G_i$'s must vanish and equations (85) and (86)
must hold.

The equations (85) and (86) for the Wentzel potentials $A_\mu$ should
be compared with the equations (55) for the Maxwell potentials $\mathcal{A}_\mu$.
Of the two equations (13) satisfied by the electromagnetic potentials
in the absence of charges, the first gets modified by the presence of
charges in the case of the Maxwell potentials and the second in the
case of the Wentzel potentials. For a field point $x$ lying outside the
light-cone of all the electron points $z_i$, each of the integrals in (76)
is given by (80) and the right-hand side of (76) becomes equal to the
right-hand side of (56) in the limit $\lambda = 0$. Thus for this domain of $x$
the Wentzel and the Maxwell potentials are equal.

79. Passage to the quantum theory 

Let us now construct a quantum theory analogous to the classical
theory of the preceding section. We use the same dynamical variables
as before, namely the particle coordinates and momenta $z_{\mu i}$,
$p_{\mu i}$ and the Wentzel potentials $A_\mu(x)$, and assume them to
satisfy quantum conditions corresponding to their having the same
P.B.s as in the classical theory, given by (63), (65), and (66). The
classical Hamiltonians (68) should be replaced by Hamiltonians of the
form given in the preceding chapter, applying to particles with a
spin $\tfrac12\hbar$, in order to get satisfactory relativistic wave
equations. Thus we must introduce further dynamical variables to
describe the spins. For the $i$th particle we need the spin variables
$\alpha_{ri}$ $(r = 1, 2, 3)$ and $\alpha_{mi}$, which anticommute
with one another and have their squares equal to unity, and which
commute with all the $z_{\mu i}$, $p_{\mu i}$, and $A_\mu(x)$
variables, and also with the spin variables of the other particles.
We can then set up Hamiltonians of the form of the operator in (9)
and (10) of § 67,

$$G_i = p_{0i} - e_i A_0(z_i) + (\boldsymbol{\alpha}_i,\ \mathbf{p}_i - e_i \mathbf{A}_{z_i}) + \alpha_{mi} m_i, \tag{87}$$

to replace the classical Hamiltonians (68), $\mathbf{A}_{z_i}$ being
written instead of $\mathbf{A}(z_i)$ in the three-dimensional scalar
product.

We describe a state of motion of the whole system of particles and
field by a wave function in the coordinates and times $z_{\mu i}$ of
the particles, which wave function is a ket in the other degrees of
freedom, i.e. those of the field and of the spins of the particles.
Following the notation of the end of § 20, we write this
wave-function-ket as $|z\rangle$. It must satisfy the wave equations

$$G_i\, |z\rangle = 0, \tag{88}$$

which may be looked upon as supplementary conditions corresponding
to the classical equations (72). For the various equations (88) to
be consistent we need, by an application of (45),

$$[G_i, G_j]\, |z\rangle = 0, \tag{89}$$

a rather more stringent condition than the classical consistency
condition (62). With the Hamiltonians (87), $[G_i, G_j] = 0$ when (70)
holds and the condition (89) is then satisfied. The conditions (70)
can be brought in by supposing that $|z\rangle$ is defined only for
values of the $z$-variables satisfying (70), so that it is only in
this domain of definition of $|z\rangle$ that equations (89) have to
hold. The wave equations (88) are consistent in this domain.

The remaining equations of the classical theory, equations (85) and
(86), must now be taken over into the quantum theory. Equation
(85) may be assumed to hold unchanged in the quantum theory, as
it does not give rise to any inconsistency because its left-hand side
commutes with all the dynamical variables. Equation (86) must be
replaced by a supplementary condition, as otherwise it would lead
to inconsistencies. Defining $R(x)$ by

$$R(x) = \partial A_\mu(x)/\partial x_\mu + \tfrac12 \sum_i e_i \{\Delta(x-z_i+\lambda)+\Delta(x-z_i-\lambda)\}, \tag{90}$$

we take the supplementary condition

$$R(x)\, |z\rangle = 0 \tag{91}$$

holding for all $x$ as the quantum analogue of the classical equation
(86). It is a generalization of the supplementary condition (43) for
no charges. We have, using (66),

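$$\begin{aligned}
[R(x),\ p_{\mu i} - e_i A_\mu(z_i)]
 &= -\partial R(x)/\partial z_{\mu i} - e_i [R(x), A_\mu(z_i)] \\
 &= -\tfrac12 e_i \Big(\frac{\partial}{\partial z_{\mu i}} + \frac{\partial}{\partial x_\mu}\Big)\{\Delta(x-z_i+\lambda)+\Delta(x-z_i-\lambda)\} \\
 &= 0,
\end{aligned} \tag{92}$$

so that from (68)

$$[R(x), G_i] = 0. \tag{93}$$

Thus the supplementary conditions (88) and (91) are consistent.
Again

$$\begin{aligned}
[R(x), R(x')]
 &= \left[\frac{\partial A_\mu(x)}{\partial x_\mu},\ \frac{\partial A_\nu(x')}{\partial x'_\nu}\right] \\
 &= \tfrac12 \frac{\partial^2}{\partial x_\mu\, \partial x'^{\mu}}\{\Delta(x-x'+\lambda)+\Delta(x-x'-\lambda)\} \\
 &= -\tfrac12\, \Box\, \{\Delta(x-x'+\lambda)+\Delta(x-x'-\lambda)\} = 0
\end{aligned} \tag{94}$$

from (11), so that the various supplementary conditions (91) obtained
by putting different values for $x$ are consistent with one another.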

We now have the complete scheme of quantum equations corresponding
to the classical theory of the preceding section, namely the
P.B. relations (63), (65), and (66) together with the equations (85),
(88), and (91), and have verified that they are all consistent for the
domain of the $z$'s for which (70) holds. If some of the particles are
of the same kind and are bosons or fermions, the further conditions
must be imposed that $|z\rangle$ is symmetrical or antisymmetrical, as
the case may be, between the coordinates (and spin variables) of the
similar particles.

The wave-function-ket $|z\rangle$, if normalized, has the physical
interpretation that $\langle z|z\rangle$ is the probability, per unit
three-dimensional volume for each particle, of each particle being in
the neighbourhood of the place fixed by its coordinates $z_{1i}$,
$z_{2i}$, $z_{3i}$ at the time $z_{0i}$. The theory allows one to
calculate this probability, for any state of motion of the system,
only provided the conditions (70) are satisfied, which means, in the
limit $\lambda = 0$, that the points $z_i$ in space-time must each be
outside the light-cones of the others. The observations of whether the
particles are at the places $z_{1i}$, $z_{2i}$, $z_{3i}$ at the times
$z_{0i}$ are thus compatible observations only provided the points
$z_i$ in space-time are outside each other's light-cones. This result
of the theory is to be expected on general physical grounds, since the
observation of whether a particle is at a particular place at a
particular time may be expected to produce a disturbance throughout
that region of space-time lying inside the future light-cone of the
particular place and time.
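
The compatibility condition just stated can be checked mechanically for a given set of space-time points. The following is a minimal sketch in Python, assuming units in which the velocity of light is unity and taking the limit $\lambda = 0$; the function names `interval` and `mutually_spacelike` and the sample points are illustrative only and do not come from the text.

```python
from itertools import combinations


def interval(a, b):
    """Return (xx) = x0^2 - x1^2 - x2^2 - x3^2 for the 4-vector x = a - b."""
    x = [p - q for p, q in zip(a, b)]
    return x[0] ** 2 - sum(c ** 2 for c in x[1:])


def mutually_spacelike(points):
    """True if every point lies outside the light-cones of all the others."""
    return all(interval(a, b) < 0 for a, b in combinations(points, 2))


# Two position observations at equal times but different places.
compatible = mutually_spacelike([(0.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)])
# Second observation inside the future light-cone of the first.
incompatible = mutually_spacelike([(0.0, 0.0, 0.0, 0.0), (2.0, 1.0, 0.0, 0.0)])
```

Each point is written as (time, three space coordinates). Only the first pair satisfies the condition under which the theory gives a joint probability.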

Equation (85) enables us to resolve the potentials into their Fourier
components according to

$$A_\mu(x) = \int \{A_{\mathbf{k}\mu}\, e^{i(kx)} + \bar{A}_{\mathbf{k}\mu}\, e^{-i(kx)}\}\, k_0^{-1}\, d^3k \tag{95}$$

with

$$k_0 = |\mathbf{k}|, \tag{96}$$

as in the case of no charges. The Fourier coefficients
$A_{\mathbf{k}\mu}$ no longer satisfy the commutation relations (27) on
account of the occurrence of $\lambda$ in (66). They still satisfy (28)
and instead of (27) they satisfy

$$\bar{A}_{\mathbf{k}\mu} A_{\mathbf{k}'\nu} - A_{\mathbf{k}'\nu} \bar{A}_{\mathbf{k}\mu} = -\frac{g_{\mu\nu}}{4\pi^2}\, \hbar k_0 \cos(k\lambda)\, \delta^3(\mathbf{k}-\mathbf{k}'), \tag{97}$$

as may be verified by noting that (28) and (97) lead to equation
(29) with the extra factor $\cos(k\lambda)$ in the integrand and this
extra factor makes equation (29) lead to (66) instead of (30).

It is convenient to redefine $A_{\mathbf{k}\mu}$ for those values of
$\mathbf{k}$ for which $\cos(k\lambda)$ is negative so that

$$\text{new } A_{\mathbf{k}\mu} = \text{old } \bar{A}_{-\mathbf{k}\,\mu}.$$

Thus the new Fourier coefficient $A_{\mathbf{k}\mu}$ exists when
$k_0 \cos(k\lambda) > 0$. With $\lambda$ very small, the redefinition
affects only Fourier coefficients with very large $\mathbf{k}$-values.
With the new $A_{\mathbf{k}\mu}$ equation (95) still holds if (96) is
replaced by†

$$k_0 = |\mathbf{k}|\, |\cos(k\lambda)| / \cos(k\lambda) \tag{98}$$

and (97) holds unchanged. The right-hand side of (97) with
$\mathbf{k} = \mathbf{k}'$ is now always positive for $\mu = \nu = 1$,
2, or 3 and negative for $\mu = \nu = 0$. This enables us to express
any ket in the degrees of freedom of the field as a power series in
the variables $A_{\mathbf{k}1}$, $A_{\mathbf{k}2}$, $A_{\mathbf{k}3}$,
and $\bar{A}_{\mathbf{k}0}$ multiplied into the standard ket
$\rangle_F$ corresponding to no energy in each of the degrees of
freedom, as we had at the end of § 76. Expressing the
wave-function-ket $|z\rangle$ in this way, we have

$$|z\rangle = \psi\rangle_F, \tag{99}$$

where $\psi$ is a power series in the variables $A_{\mathbf{k}1}$,
$A_{\mathbf{k}2}$, $A_{\mathbf{k}3}$, $\bar{A}_{\mathbf{k}0}$, whose
coefficients are each a wave function in the $z$-variables and a ket
in the spin degrees of freedom. These coefficients correspond to there
being different numbers of photons in the various degrees of freedom
of the field.

† If $\lambda$ does not lie along the time axis there are some regions
of $(k_1 k_2 k_3)$-space for which there is no $k_0$ satisfying (98)
and others for which there are two. The integral (95), and similar
integrals in the future, are then to be understood as taken over the
domain of $(k_1 k_2 k_3)$-space for which (98) has a solution and as
summed over both values of the integrand for that part of the domain
for which (98) has two solutions. From the four-dimensional point of
view, the domain of integration is that part of the light-cone
$(kk) = 0$ for which $k_0 \cos(k\lambda) > 0$, and is Lorentz invariant
for a given $\lambda$.
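
The counting of solutions of (98) described in the footnote may be illustrated numerically. The sketch below, in Python, tests for a given three-dimensional $\mathbf{k}$ and a given $\lambda$ which of the two candidates $k_0 = \pm|\mathbf{k}|$ satisfy $k_0\cos(k\lambda) > 0$, with $(k\lambda) = k_0\lambda_0 - k_1\lambda_1 - k_2\lambda_2 - k_3\lambda_3$. The function name `k0_solutions` and the sample values of $\lambda$ are assumptions made for illustration only.

```python
import math


def k0_solutions(k3, lam):
    """Return those of k0 = +|k|, -|k| for which k0 * cos(k.lam) > 0.

    k3  : three-dimensional wave vector (k1, k2, k3)
    lam : four-vector (lam0, lam1, lam2, lam3)
    """
    norm = math.sqrt(sum(c * c for c in k3))
    spatial = sum(kc * lc for kc, lc in zip(k3, lam[1:]))
    solutions = []
    for k0 in (norm, -norm):
        # Four-dimensional scalar product (k lam) for this choice of k0.
        if k0 * math.cos(k0 * lam[0] - spatial) > 0:
            solutions.append(k0)
    return solutions


# lam along the time axis: exactly one solution, with the sign of cos(k lam).
small_k = k0_solutions((1.0, 0.0, 0.0), (0.01, 0.0, 0.0, 0.0))
large_k = k0_solutions((200.0, 0.0, 0.0), (0.01, 0.0, 0.0, 0.0))

# lam with a space component: two solutions or none may occur.
two = k0_solutions((100.0, 0.0, 0.0), (0.02, 0.01, 0.0, 0.0))
none = k0_solutions((200.0, 0.0, 0.0), (0.02, 0.01, 0.0, 0.0))
```

With $\lambda$ along the time axis the result reduces to the single value given by (98); the last two calls show the two exceptional cases mentioned in the footnote.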

80. Elimination of the longitudinal waves 

The electromagnetic field in the foregoing electrodynamical theory,
both classical and quantum, involves longitudinal waves as well as
transverse ones. The potentials $A_\mu(x)$ may be expressed as

$$A_\mu(x) = L_\mu(x) + M_\mu(x), \tag{100}$$

where $L_\mu(x)$ are the potentials of the longitudinal waves and
$M_\mu(x)$ those of the transverse waves. The longitudinal waves are
made up of the components $A_{\mathbf{k}0}$ and $A_{\mathbf{k}K}$ of
the Fourier component $A_{\mathbf{k}\mu}$, as discussed in § 76. Here
$A_{\mathbf{k}K}$ is the component of the three-dimensional vector
$A_{\mathbf{k}r}$ $(r = 1, 2, 3)$ in the direction of the
three-dimensional vector $k_r$, so that, expressed as a
three-dimensional vector, it equals
$(\mathbf{k}\mathbf{A}_{\mathbf{k}})\, k_r k_0^{-2}$. Thus

$$\begin{aligned}
L_0(x) &= A_0(x), \\
L_r(x) &= \int \{(\mathbf{k}\mathbf{A}_{\mathbf{k}})\, e^{i(kx)} + (\mathbf{k}\bar{\mathbf{A}}_{\mathbf{k}})\, e^{-i(kx)}\}\, k_r k_0^{-3}\, d^3k.
\end{aligned} \tag{101}$$

These equations fix the longitudinal part of the potentials, and the
transverse part is then fixed by (100), i.e.

$$M_0(x) = 0, \qquad M_r(x) = A_r(x) - L_r(x). \tag{102}$$

The longitudinal waves are not physically important. They can
be eliminated from the equations by a certain mathematical
transformation, which forms a generalization of the method which led to
equation (51) for the case of no charges. The equations are thereby
simplified and brought into more direct connexion with experiment,
but they lose their relativistic form, as the separation of the field into
longitudinal and transverse waves is not Lorentz invariant.

By making a Fourier resolution of the left-hand side of equation
(91) we get, with the help of (10), the equations

$$\begin{aligned}
\Big\{(\mathbf{k}\mathbf{A}_{\mathbf{k}}) - k_0 A_{\mathbf{k}0} + \frac{\cos(k\lambda)}{4\pi^2} \sum_i e_i\, e^{-i(kz_i)}\Big\}\, |z\rangle &= 0, \\
\Big\{(\mathbf{k}\bar{\mathbf{A}}_{\mathbf{k}}) - k_0 \bar{A}_{\mathbf{k}0} + \frac{\cos(k\lambda)}{4\pi^2} \sum_i e_i\, e^{i(kz_i)}\Big\}\, |z\rangle &= 0,
\end{aligned} \tag{103}$$

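forming the generalization of (48). If for the moment we take discrete
$\mathbf{k}$-values, the commutation relations (97) become, from (26),

$$\bar{A}_{\mathbf{k}\mu} A_{\mathbf{k}'\nu} - A_{\mathbf{k}'\nu} \bar{A}_{\mathbf{k}\mu} = -\frac{g_{\mu\nu}}{4\pi^2}\, \hbar k_0 \cos(k\lambda)\, s_{\mathbf{k}}\, \delta_{\mathbf{k}\mathbf{k}'}, \tag{104}$$

and show us that, with the notation (99),

$$\bar{A}_{\mathbf{k}r}\, \psi\rangle_F = \frac{\hbar k_0 \cos(k\lambda)\, s_{\mathbf{k}}}{4\pi^2}\, \frac{\partial \psi}{\partial A_{\mathbf{k}r}}\rangle_F, \qquad
A_{\mathbf{k}0}\, \psi\rangle_F = \frac{\hbar k_0 \cos(k\lambda)\, s_{\mathbf{k}}}{4\pi^2}\, \frac{\partial \psi}{\partial \bar{A}_{\mathbf{k}0}}\rangle_F. \tag{105}$$

Thus equations (103) become, on multiplication by
$4\pi^2/\cos(k\lambda)$,

$$\begin{aligned}
\Big\{\frac{4\pi^2}{\cos(k\lambda)} (\mathbf{k}\mathbf{A}_{\mathbf{k}}) - \hbar k_0^2 s_{\mathbf{k}} \frac{\partial}{\partial \bar{A}_{\mathbf{k}0}} + \sum_i e_i\, e^{-i(kz_i)}\Big\}\, \psi\rangle_F &= 0, \\
\Big\{\hbar k_0 s_{\mathbf{k}} \sum_r k_r \frac{\partial}{\partial A_{\mathbf{k}r}} - \frac{4\pi^2}{\cos(k\lambda)}\, k_0 \bar{A}_{\mathbf{k}0} + \sum_i e_i\, e^{i(kz_i)}\Big\}\, \psi\rangle_F &= 0.
\end{aligned}$$

These equations holding for all $\mathbf{k}$ show that $\psi$ is of the
form

$$\psi = e^{S/\hbar} \chi_1, \tag{106}$$

where

$$S = \sum_{\mathbf{k}} k_0^{-2} s_{\mathbf{k}}^{-1} \Big\{4\pi^2 (\mathbf{k}\mathbf{A}_{\mathbf{k}}) \bar{A}_{\mathbf{k}0} / \cos(k\lambda) + \sum_i e_i [\bar{A}_{\mathbf{k}0}\, e^{-i(kz_i)} - k_0^{-1} (\mathbf{k}\mathbf{A}_{\mathbf{k}})\, e^{i(kz_i)}]\Big\}$$

and $\chi_1$ is independent of $A_{\mathbf{k}K}$ and
$\bar{A}_{\mathbf{k}0}$. Passing back to continuous
$\mathbf{k}$-values, we find that $\psi$ is still of the form (106)
with $S$ given by

$$S = \int \Big\{4\pi^2 (\mathbf{k}\mathbf{A}_{\mathbf{k}}) \bar{A}_{\mathbf{k}0} / \cos(k\lambda) + \sum_i e_i [\bar{A}_{\mathbf{k}0}\, e^{-i(kz_i)} - k_0^{-1} (\mathbf{k}\mathbf{A}_{\mathbf{k}})\, e^{i(kz_i)}]\Big\}\, k_0^{-2}\, d^3k. \tag{107}$$

Thus, as in the case of no charges, we find that the form of the wave
function $\psi$ is fixed so far as concerns the longitudinal degrees of
freedom. The important part of $\psi$ is the factor $\chi_1$, which
involves only the transverse components of $A_{\mathbf{k}r}$, together
with the $z$'s and spin variables.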

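We may look upon $\chi_1$ as a wave function from which the
longitudinal waves have been eliminated. We can obtain wave equations
for $\chi_1$ in the following way. We have

$$\begin{aligned}
p_{\mu j}\, e^{S/\hbar} &= e^{S/\hbar} \{p_{\mu j} + i\, \partial S/\partial z_{\mu j}\} \\
 &= e^{S/\hbar} \Big\{p_{\mu j} + e_j \int [\bar{A}_{\mathbf{k}0}\, e^{-i(kz_j)} + k_0^{-1} (\mathbf{k}\mathbf{A}_{\mathbf{k}})\, e^{i(kz_j)}]\, k_\mu k_0^{-2}\, d^3k\Big\}.
\end{aligned} \tag{108}$$

Using this result for $\mu = 0$, we get

$$\begin{aligned}
\{p_{0j} - e_j L_0(z_j)\}\, \psi\rangle_F
 &= \Big\{p_{0j} - e_j \int [A_{\mathbf{k}0}\, e^{i(kz_j)} + \bar{A}_{\mathbf{k}0}\, e^{-i(kz_j)}]\, k_0^{-1}\, d^3k\Big\}\, e^{S/\hbar} \chi_1\rangle_F \\
 &= e^{S/\hbar} p_{0j}\, \chi_1\rangle_F + e_j \int [(\mathbf{k}\mathbf{A}_{\mathbf{k}}) - k_0 A_{\mathbf{k}0}]\, e^{i(kz_j)}\, k_0^{-2}\, d^3k\ e^{S/\hbar} \chi_1\rangle_F \\
 &= e^{S/\hbar} \Big\{p_{0j} - \frac{e_j}{4\pi^2} \sum_i e_i \int \cos(k\lambda)\, e^{i(k,\, z_j - z_i)}\, k_0^{-2}\, d^3k\Big\}\, \chi_1\rangle_F
\end{aligned}$$

with the help of the first of equations (103). Again, using (101) and
(108) with $\mu = r$, we get

$$\begin{aligned}
\{p_{rj} - e_j L_r(z_j)\}\, \psi\rangle_F
 &= \Big\{p_{rj} - e_j \int [(\mathbf{k}\mathbf{A}_{\mathbf{k}})\, e^{i(kz_j)} + (\mathbf{k}\bar{\mathbf{A}}_{\mathbf{k}})\, e^{-i(kz_j)}]\, k_r k_0^{-3}\, d^3k\Big\}\, e^{S/\hbar} \chi_1\rangle_F \\
 &= e^{S/\hbar} p_{rj}\, \chi_1\rangle_F + e_j \int [k_0 \bar{A}_{\mathbf{k}0} - (\mathbf{k}\bar{\mathbf{A}}_{\mathbf{k}})]\, e^{-i(kz_j)}\, k_r k_0^{-3}\, d^3k\ e^{S/\hbar} \chi_1\rangle_F \\
 &= e^{S/\hbar} \Big\{p_{rj} + \frac{e_j}{4\pi^2} \sum_i e_i \int \cos(k\lambda)\, e^{-i(k,\, z_j - z_i)}\, k_r k_0^{-3}\, d^3k\Big\}\, \chi_1\rangle_F
\end{aligned}$$

with the help of the second of equations (103). These equations may
be combined as

$$\{p_{\mu j} - e_j L_\mu(z_j)\}\, \psi\rangle_F = e^{S/\hbar} \{p_{\mu j} - e_j B_\mu(z_j)\}\, \chi_1\rangle_F, \tag{109}$$

where

$$\begin{aligned}
B_0(x) &= \frac{1}{4\pi^2} \sum_i e_i \int \cos(k\lambda)\, e^{i(k,\, x - z_i)}\, k_0^{-2}\, d^3k, \\
B_r(x) &= -\frac{1}{4\pi^2} \sum_i e_i \int \cos(k\lambda)\, e^{-i(k,\, x - z_i)}\, k_r k_0^{-3}\, d^3k.
\end{aligned}$$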

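The equations may be simplified by a further transformation, namely

$$\chi_1 = e^{T/\hbar} \chi, \tag{110}$$

where

$$T = -\frac{1}{8\pi^2} \sum_{ij} e_i e_j \int \cos(k\lambda) \cos(k,\, z_i - z_j)\, k_0^{-3}\, d^3k. \tag{111}$$

Equations (109) go over into

$$\begin{aligned}
\{p_{\mu j} - e_j L_\mu(z_j)\}\, \psi\rangle_F
 &= e^{(S+T)/\hbar} \{p_{\mu j} - e_j B_\mu(z_j) + i\, \partial T/\partial z_{\mu j}\}\, \chi\rangle_F \\
 &= e^{(S+T)/\hbar} \{p_{\mu j} - e_j b_\mu(z_j)\}\, \chi\rangle_F,
\end{aligned} \tag{112}$$

where

$$\begin{aligned}
b_\mu(x) &= B_\mu(x) - \frac{i}{4\pi^2} \sum_i e_i \int \cos(k\lambda) \sin(k,\, x - z_i)\, k_\mu k_0^{-3}\, d^3k \\
 &= \pm \frac{1}{4\pi^2} \sum_i e_i \int \cos(k\lambda) \cos(k,\, x - z_i)\, k_\mu k_0^{-3}\, d^3k,
\end{aligned} \tag{113}$$

the $+$ or $-$ sign being taken according to whether $\mu$ is zero or
not.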

With the help of (100) and (112), the wave equations (87), (88) go
over into

$$\{p_{0j} - e_j b_0(z_j) + (\boldsymbol{\alpha}_j,\ \mathbf{p}_j - e_j \mathbf{b}_{z_j} - e_j \mathbf{M}_{z_j}) + \alpha_{mj} m_j\}\, \chi\rangle_F = 0. \tag{114}$$

The variables describing the longitudinal waves have all disappeared
from these equations. We may take $\chi$ as the wave function for the
theory in which the longitudinal waves have been eliminated (it is
rather more convenient for this purpose than $\chi_1$), and equations
(114) are the wave equations which it has to satisfy. The influence of
the longitudinal waves now shows itself up through the functions
$b_\mu(z_j)$ of the particle variables appearing in the Hamiltonians.
The supplementary conditions (91) have been satisfied through our
using (106), and drop out of the present formulation of the theory.

To work out the function $b_\mu(x)$ we must evaluate integrals of the
form

$$I_\mu(x) = \int \cos(kx)\, k_\mu k_0^{-3}\, d^3k \tag{115}$$

for a general 4-vector $x$, with $k_0$ given by (98). Since the
integrand in (115) is unchanged when $-k$ is put for $k$, the integral
is equal to

$$I_\mu(x) = \tfrac12 \sum_{k_0} \int \cos(kx)\, k_\mu k_0^{-3}\, d^3k,$$

where $\sum_{k_0}$ means summing over both values $\pm|\mathbf{k}|$ for
$k_0$. Thus $I_\mu(x)$ equals

$$I_\mu(x) = \tfrac12 \int \Delta(k) \cos(kx)\, k_\mu k_0^{-2}\, d^4k.$$

This integral may be evaluated most conveniently from formula (10),
which gives us, on taking the real part of both sides,

$$\tfrac12 \int \Delta(k) \sin(kx)\, d^4k = 2\pi^2 \Delta(x) = 2\pi^2 |\mathbf{x}|^{-1} \{\delta(x_0 - |\mathbf{x}|) - \delta(x_0 + |\mathbf{x}|)\}.$$

Integrating both sides here with respect to $x_0$, we find

$$I_0(x) = \tfrac12 \int \Delta(k) \cos(kx)\, k_0^{-1}\, d^4k =
\begin{cases}
0 & \text{for } (xx) > 0, \\
2\pi^2 |\mathbf{x}|^{-1} & \text{for } (xx) < 0,
\end{cases} \tag{116}$$

the constant of integration being fixed by the condition that $I_0(x)$
vanishes for $x_0 \to \pm\infty$ with $x_1$, $x_2$, $x_3$ fixed.
Integrating (116) with respect to $x_0$, we find

$$\tfrac12 \int \Delta(k) \sin(kx)\, k_0^{-2}\, d^4k =
\begin{cases}
-2\pi^2 & \text{for } (xx) > 0,\ x_0 < 0, \\
2\pi^2 x_0 |\mathbf{x}|^{-1} & \text{for } (xx) < 0, \\
2\pi^2 & \text{for } (xx) > 0,\ x_0 > 0,
\end{cases}$$

the constant of integration being fixed by the condition that the
integral vanishes for $x_0 = 0$. Differentiating with respect to $x_r$,
we get

$$I_r(x) = \tfrac12 \int \Delta(k) \cos(kx)\, k_r k_0^{-2}\, d^4k =
\begin{cases}
0 & \text{for } (xx) > 0, \\
2\pi^2 x_0 x_r |\mathbf{x}|^{-3} & \text{for } (xx) < 0.
\end{cases} \tag{117}$$



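Using the results (116), (117) in (113), we get, with reference to (70),

$$\begin{aligned}
b_0(z_j) &= \tfrac14 \sum_i e_i \{|\mathbf{z}_j - \mathbf{z}_i + \boldsymbol{\lambda}|^{-1} + |\mathbf{z}_j - \mathbf{z}_i - \boldsymbol{\lambda}|^{-1}\}, \\
b_r(z_j) &= -\tfrac14 \sum_i e_i \Big\{\frac{(z_{0j} - z_{0i} + \lambda_0)(z_{rj} - z_{ri} + \lambda_r)}{|\mathbf{z}_j - \mathbf{z}_i + \boldsymbol{\lambda}|^3} + \frac{(z_{0j} - z_{0i} - \lambda_0)(z_{rj} - z_{ri} - \lambda_r)}{|\mathbf{z}_j - \mathbf{z}_i - \boldsymbol{\lambda}|^3}\Big\}.
\end{aligned} \tag{118}$$

The terms $i = j$ in the sums are zero on account of
$(\lambda\lambda) > 0$. These terms would have been infinitely great if
we had put $\lambda = 0$ in (113), so we see here the need for not
passing to the limit $\lambda \to 0$ too early in the theory. However,
it is permissible to put $\lambda = 0$ in (118), so we may take

$$\begin{aligned}
b_0(z_j) &= \tfrac12 \sum_{i \neq j} e_i / |\mathbf{z}_j - \mathbf{z}_i|, \\
b_r(z_j) &= -\tfrac12 \sum_{i \neq j} e_i (z_{0j} - z_{0i})(z_{rj} - z_{ri}) / |\mathbf{z}_j - \mathbf{z}_i|^3.
\end{aligned} \tag{119}$$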
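
As a check on the passage from (116) to (118) and (119), the following Python sketch evaluates $b_0(z_j)$ directly from the two values of $I_0$ in (116), using $\cos(k\lambda)\cos(kx) = \tfrac12\{\cos(k,\, x+\lambda) + \cos(k,\, x-\lambda)\}$ in (113). The names `I0` and `b0` and the sample configuration are hypothetical; the sketch covers only the case $\mu = 0$ and ignores points lying exactly on a light-cone.

```python
import math


def I0(x):
    """I_0(x) as in (116): 2*pi^2/|x| outside the light-cone, 0 inside."""
    r = math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    xx = x[0] ** 2 - r ** 2
    return 2 * math.pi ** 2 / r if xx < 0 else 0.0


def b0(j, z, e, lam):
    """b_0(z_j) from (113) with mu = 0, keeping lambda finite."""
    total = 0.0
    for i in range(len(z)):
        plus = [a - b + l for a, b, l in zip(z[j], z[i], lam)]
        minus = [a - b - l for a, b, l in zip(z[j], z[i], lam)]
        # The term i = j has x = +lam or -lam, which is time-like, so I0 = 0.
        total += e[i] * 0.5 * (I0(plus) + I0(minus))
    return total / (4 * math.pi ** 2)


# Two unit charges at equal times, a distance 2 apart (points are 4-vectors).
z = [(0.0, 0.0, 0.0, 0.0), (0.0, 2.0, 0.0, 0.0)]
e = [1.0, 1.0]
time_like_lam = b0(0, z, e, (1e-3, 0.0, 0.0, 0.0))
tilted_lam = b0(0, z, e, (1e-3, 5e-4, 0.0, 0.0))
limit_value = 0.5 * e[1] / 2.0  # b_0 from (119)
```

The self-term drops out without any special treatment, and both finite-$\lambda$ values agree with (119) to within terms that vanish with $\lambda$.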


The relativistic form of the theory has been spoilt by the elimination
of the longitudinal waves. There is now not much point in
retaining different time variables $z_{0i}$ for the different
particles. By putting all the $z_0$'s equal to $t$ we can get a further
simplification of the equations. We have in the first place
$b_r(z_j) = 0$. We can write the wave equations (114) as

$$i\hbar\, \partial\chi/\partial z_{0j} = H_j \chi,$$

where

$$H_j = e_j b_0(z_j) - (\boldsymbol{\alpha}_j,\ \mathbf{p}_j - e_j \mathbf{M}_{z_j}) - \alpha_{mj} m_j.$$

We then have

$$i\hbar\, \frac{d}{dt}\, \chi_{z_0 = t} = \sum_j H_j\, \chi_{z_0 = t}. \tag{120}$$

Thus the wave function $\chi_{z_0 = t}$ satisfies one wave equation, in
which the Hamiltonian is the sum of the Hamiltonians in the many-time
formulation.

The total contribution of the $b_0(z_j)$ terms to the Hamiltonian
$\sum_j H_j$ is

$$\sum_j e_j b_0(z_j) = \sum_{i<j} e_i e_j / |\mathbf{z}_i - \mathbf{z}_j|. \tag{121}$$

This is precisely the Coulomb interaction energy. Thus the
longitudinal waves get replaced by the Coulomb interaction energy in the
single-time formulation of the theory. We can now see the real
significance of the longitudinal waves of the Wentzel field. They are to
enable one to bring the Coulomb forces into electrodynamics in a
relativistic manner.
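
The equality (121) rests only on each pair being counted twice in the double sum, with the factor $\tfrac12$ of (119) compensating. A minimal Python sketch of this bookkeeping follows, assuming equal times for all particles and Gaussian units as in the text; the function names and the sample charges and positions are illustrative, not taken from the source.

```python
import math
from itertools import combinations


def b0_limit(j, positions, charges):
    """b_0(z_j) as in (119): half the sum of e_i/|z_j - z_i| over i != j."""
    return 0.5 * sum(
        charges[i] / math.dist(positions[j], positions[i])
        for i in range(len(positions))
        if i != j
    )


def energy_from_b0(positions, charges):
    """Left-hand side of (121): sum over j of e_j * b_0(z_j)."""
    return sum(
        charges[j] * b0_limit(j, positions, charges)
        for j in range(len(positions))
    )


def coulomb_energy(positions, charges):
    """Right-hand side of (121): sum over pairs i < j of e_i e_j / |z_i - z_j|."""
    return sum(
        charges[i] * charges[j] / math.dist(positions[i], positions[j])
        for i, j in combinations(range(len(positions)), 2)
    )


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, -1.0, 2.0]
lhs = energy_from_b0(positions, charges)
rhs = coulomb_energy(positions, charges)
```

For any configuration with distinct positions the two values agree up to rounding.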

A further transformation of the wave equation is of interest. Let
us put

$$\Psi = e^{-iH_R t/\hbar}\, \chi_{z_0 = t}, \tag{122}$$

where $H_R$ is the Hamiltonian of the field in the absence of charges,
given by (41), and let us consider $\Psi$ as a new wave function. It
satisfies the wave equation

$$i\hbar\, \partial\Psi/\partial t = \Big(H_R + \sum_j H_j^*\Big) \Psi, \tag{123}$$

where

$$H_j^* = e^{-iH_R t/\hbar} H_j\, e^{iH_R t/\hbar} = e_j b_0(z_j) - (\boldsymbol{\alpha}_j,\ \mathbf{p}_j - e_j \mathbf{M}^*_{z_j}) - \alpha_{mj} m_j,$$

with

$$M_r^*(x) = e^{-iH_R t/\hbar} M_r(x)\, e^{iH_R t/\hbar}.$$

If we express $M_r(x)$ in terms of its Fourier components

$$M_r(x) = \int \{M_{\mathbf{k}r}\, e^{i(kx)} + \bar{M}_{\mathbf{k}r}\, e^{-i(kx)}\}\, k_0^{-1}\, d^3k, \tag{124}$$

$M_{\mathbf{k}r}$ being the part of the three-dimensional vector
$A_{\mathbf{k}r}$ perpendicular to $k_r$, then we have, with the help
of (42) and (1),

$$M_r^*(t, x_1, x_2, x_3) = \int \{M_{\mathbf{k}r}\, e^{-i(\mathbf{k}\mathbf{x})} + \bar{M}_{\mathbf{k}r}\, e^{i(\mathbf{k}\mathbf{x})}\}\, k_0^{-1}\, d^3k. \tag{125}$$

Thus $M_r^*(t, x_1, x_2, x_3)$ is a function of the $M_{\mathbf{k}r}$
not involving $t$, and is a constant linear operator. The Hamiltonian
in the wave equation (123) is now constant, and the wave equation
itself is of the usual form for an isolated system in non-relativistic
theory. Further, the Hamiltonian in (123) is just what one would get
with the non-relativistic theory of § 62 if one takes for $H_P$ in
equation (53) of § 62 the proper-energy of a set of particles each
with spin $\tfrac12\hbar$ together with their Coulomb interaction
energy. This rather surprising result means that the theory of § 62
applied to particles with spin $\tfrac12\hbar$ and with Coulomb
interaction energy is essentially a relativistic theory, leading to
physical consequences which are invariant under Lorentz
transformations, in spite of the form of the theory departing so much
from the usual relativistic requirements.


81. Discussion of the transverse waves 

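Let us apply the theory of the preceding section to the case of a
single particle. There is then just one wave equation (114) and the
terms involving $b$ drop out, so the wave equation becomes

$$\{p_0 + (\boldsymbol{\alpha}\mathbf{p}) + \alpha_m m\}\, \chi\rangle_F = e (\boldsymbol{\alpha}\mathbf{M}_z)\, \chi\rangle_F. \tag{126}$$

This is the wave equation for a single particle interacting with the
electromagnetic field. Let us try to get a solution of it on the
assumption that the interaction term in the Hamiltonian, namely
$e(\boldsymbol{\alpha}\mathbf{M}_z)$, is small. Such a solution would
be of the form of a power series in the charge $e$,

$$\chi = \chi_0 + e\chi_1 + e^2\chi_2 + \ldots, \tag{127}$$

where $\chi_0$, $\chi_1$, $\chi_2$, ... are independent of $e$.
Substituting (127) in (126) and picking out terms of different degree
in $e$, we get the successive equations

$$\{p_0 + (\boldsymbol{\alpha}\mathbf{p}) + \alpha_m m\}\, \chi_0\rangle_F = 0, \tag{128}$$

$$\{p_0 + (\boldsymbol{\alpha}\mathbf{p}) + \alpha_m m\}\, \chi_1\rangle_F = (\boldsymbol{\alpha}\mathbf{M}_z)\, \chi_0\rangle_F, \tag{129}$$

$$\{p_0 + (\boldsymbol{\alpha}\mathbf{p}) + \alpha_m m\}\, \chi_2\rangle_F = (\boldsymbol{\alpha}\mathbf{M}_z)\, \chi_1\rangle_F. \tag{130}$$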

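A solution of (128) corresponding to the particle having the energy
and momentum $p'$, with $(p'p') = m^2$, and no photons present is

$$\chi_0 = e^{-i(p'z)/\hbar}\, |s\rangle, \tag{131}$$

where $|s\rangle$ is a ket in the spin degrees of freedom satisfying

$$\{p_0' + (\boldsymbol{\alpha}\mathbf{p}') + \alpha_m m\}\, |s\rangle = 0.$$

Substituting (131) in (129) and using (124) and

$$\bar{M}_{\mathbf{k}r}\rangle_F = 0, \tag{132}$$

we get

$$\{p_0 + (\boldsymbol{\alpha}\mathbf{p}) + \alpha_m m\}\, \chi_1\rangle_F = \int (\boldsymbol{\alpha}\mathbf{M}_{\mathbf{k}})\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |s\rangle\rangle_F. \tag{133}$$

To solve this equation for $\chi_1$, we multiply both sides by the
operator $\{p_0 - (\boldsymbol{\alpha}\mathbf{p}) - \alpha_m m\}$ on
the left, which gives

$$\begin{aligned}
\{(pp) - m^2\}\, \chi_1\rangle_F
 &= \{p_0 - (\boldsymbol{\alpha}\mathbf{p}) - \alpha_m m\} \int (\boldsymbol{\alpha}\mathbf{M}_{\mathbf{k}})\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |s\rangle\rangle_F \\
 &= \int \{p_0' - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\} (\boldsymbol{\alpha}\mathbf{M}_{\mathbf{k}})\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |s\rangle\rangle_F.
\end{aligned} \tag{134}$$

The operator $\{(pp) - m^2\}$ applied to the integrand here is
equivalent to the multiplying factor

$$(-\hbar k + p',\ -\hbar k + p') - m^2 = -2\hbar (kp'),$$

and hence a solution of (134) is

$$\chi_1 = -\frac{1}{2\hbar} \int (kp')^{-1} \{p_0' - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\} (\boldsymbol{\alpha}\mathbf{M}_{\mathbf{k}})\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |s\rangle.$$

This $\chi_1$ is linear in the $M_{\mathbf{k}r}$ variables and
corresponds to one photon being present. Substituting this $\chi_1$
into (130), we see that $\chi_2$ is of the form

$$\chi_2 = \chi_2^{(2)} + \chi_2^{(0)},$$

where

$$\{p_0 + (\boldsymbol{\alpha}\mathbf{p}) + \alpha_m m\}\, \chi_2^{(2)}\rangle_F = \int (\boldsymbol{\alpha}\mathbf{M}_{\mathbf{k}'})\, e^{i(k'z)}\, k_0'^{-1}\, d^3k'\ \chi_1\rangle_F, \tag{135}$$

$$\{p_0 + (\boldsymbol{\alpha}\mathbf{p}) + \alpha_m m\}\, \chi_2^{(0)}\rangle_F = \int (\boldsymbol{\alpha}\bar{\mathbf{M}}_{\mathbf{k}'})\, e^{-i(k'z)}\, k_0'^{-1}\, d^3k'\ \chi_1\rangle_F. \tag{136}$$

The right-hand side of (135) is quadratic in the $M_{\mathbf{k}r}$
variables and leads to a quadratic $\chi_2^{(2)}$, corresponding to two
photons being present, while (136) leads, as we shall see, to a
$\chi_2^{(0)}$ independent of the $M_{\mathbf{k}r}$ variables,
corresponding to no photons present.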
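
The multiplying factor $-2\hbar(kp')$ used above follows from $(kk) = 0$ and $(p'p') = m^2$. A short numerical check is sketched below in Python; the helper `dot4` and the sample values of $\hbar$, $m$, $\mathbf{p}'$, and $\mathbf{k}$ are arbitrary choices made for illustration, with the velocity of light taken as unity.

```python
import math


def dot4(a, b):
    """Four-dimensional scalar product (ab) = a0*b0 - a1*b1 - a2*b2 - a3*b3."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


hbar = 0.7  # arbitrary positive value for the check
m = 1.0

# Particle 4-momentum p' on the mass shell, (p'p') = m^2.
p3 = (0.3, -0.4, 1.2)
p = (math.sqrt(m ** 2 + sum(c * c for c in p3)),) + p3

# Photon 4-vector k on the light-cone, (kk) = 0, with k0 = |k|.
k3 = (0.5, 0.2, -0.1)
k = (math.sqrt(sum(c * c for c in k3)),) + k3

# 4-momentum after emission of the photon, p' - hbar*k.
q = tuple(pc - hbar * kc for pc, kc in zip(p, k))

lhs = dot4(q, q) - m ** 2
rhs = -2 * hbar * dot4(k, p)
```

Expanding $(p' - \hbar k,\ p' - \hbar k)$ gives $(p'p') - 2\hbar(kp') + \hbar^2(kk)$, so the two quantities agree whenever both 4-vectors satisfy their respective conditions.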

The right-hand side of (136) contains terms of the form
$\bar{M}_{\mathbf{k}'r} M_{\mathbf{k}s}\rangle_F$, so far as concerns
the field variables. Such a term becomes, with the help of (132) and
of the commutation relations (97),

$$\begin{aligned}
\bar{M}_{\mathbf{k}'r} M_{\mathbf{k}s}\rangle_F
 &= (\bar{M}_{\mathbf{k}'r} M_{\mathbf{k}s} - M_{\mathbf{k}s} \bar{M}_{\mathbf{k}'r})\rangle_F \\
 &= -\frac{g_{rs}}{4\pi^2}\, \hbar k_0 \cos(k\lambda)\, \delta^3(\mathbf{k} - \mathbf{k}')\rangle_F
\end{aligned}$$

if $r$ and $s$ denote directions in three-dimensional space
perpendicular to $(k_1 k_2 k_3)$ and either equal or perpendicular to
each other. Using this result, the right-hand side of (136) becomes

$$-\frac{1}{8\pi^2} \iint \sum_r \alpha_r \{p_0' - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\}\, \alpha_r\, (kp')^{-1} \cos(k\lambda)\, e^{i(k - k' - p'/\hbar,\ z)}\, \delta^3(\mathbf{k} - \mathbf{k}')\, k_0^{-1}\, d^3k\, d^3k'\ |s\rangle\rangle_F, \tag{137}$$

where the summation with respect to $r$ refers to two perpendicular
directions for $r$ which are both perpendicular to $(k_1 k_2 k_3)$. The
expression (137) reduces to

$$-\frac{1}{8\pi^2}\, e^{-i(p'z)/\hbar} \int \sum_r \alpha_r \{p_0' - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\}\, \alpha_r\, (kp')^{-1} \cos(k\lambda)\, k_0^{-1}\, d^3k\ |s\rangle\rangle_F.$$

This is a divergent integral since it contains, amongst other terms,
one involving

$$\int (kp')^{-1} \cos(k\lambda)\, d^3k,$$

which diverges, with $k_0$ given by (98), even before passing to the
limit $\lambda \to 0$. We can conclude that the wave equation (126) has
no solution of the form of a power series in the charge $e$. This
conclusion must hold also for the wave equation for several particles;
the transverse electromagnetic waves always lead to divergent integrals
when one tries to get a solution of the form of a power series in the
charges on the particles.
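
The growth of the integral $\int (kp')^{-1}\cos(k\lambda)\, d^3k$ with the upper limit of $|\mathbf{k}|$ can be seen in a special case. The Python sketch below assumes, purely for illustration, a particle at rest, $p' = (m, 0, 0, 0)$, and a $\lambda$ lying along the time axis with component `lam0`. With $k_0$ given by (98) the integrand is then $|\cos(|\mathbf{k}|\lambda_0)| / (m|\mathbf{k}|)$, and the integral over directions gives $(4\pi/m)\int_0^K k\, |\cos(k\lambda_0)|\, dk$ for a cut-off $K$. The function name, the midpoint rule, and the parameter values are assumptions of the sketch, not part of the text.

```python
import math


def cut_off_integral(K, lam0=1.0, m=1.0, steps=200_000):
    """Midpoint-rule estimate of (4*pi/m) * integral_0^K k*|cos(k*lam0)| dk."""
    h = K / steps
    total = 0.0
    for n in range(steps):
        k = (n + 0.5) * h
        total += k * abs(math.cos(k * lam0))
    return 4 * math.pi / m * total * h


# Cut-offs at whole numbers of periods of |cos|, each double the last.
values = [cut_off_integral(N * math.pi) for N in (10, 20, 40)]
ratios = [values[1] / values[0], values[2] / values[1]]
```

In this special case each doubling of the cut-off multiplies the value by about four, so the integral grows as the square of the cut-off for fixed non-zero $\lambda_0$.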

We have here a fundamental difficulty in quantum electrodynamics,
a difficulty which has not yet been solved. It may be that the wave
equation (126) has solutions which are not of the form of a power
series in $e$. Such solutions have not yet been found. If they exist
they are presumably very complicated. Thus even if they exist the
theory would not be satisfactory, as we should require of a
satisfactory theory that its equations have a simple solution for any
simple physical problem, and the solution of (126) for the trivial
problem of the motion of a single charged particle in the absence of
any incident field of radiation has not yet been found.

Quantum electrodynamics has many satisfactory features in it,
closely analogous to various features in classical electrodynamics.
One can get from it finite and reasonable answers for problems
concerning the emission, absorption, and scattering of radiation whose
wavelength is not too short, by cutting off the divergent integrals at
a value for $|\mathbf{k}|$ of the order $2\pi m/e^2$, which cutting off
means physically that the contribution of transverse electromagnetic
waves of wavelength less than $e^2/m$ to the process under
investigation is neglected. The wavelength $e^2/m$ is chosen for the
cut-off because it is of the order of the classical radius of a
particle of charge $e$ and mass $m$ on Lorentz's model of the electron.
The cutting off is not a relativistic procedure and can lead to
well-defined results only for problems in which the important
wavelengths are considerably greater than $e^2/m$.
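
For orientation, the size of the cut-off can be worked out for the electron. The Python sketch below restores the velocity of light, writing the length $e^2/m$ as $e^2/mc^2$ in Gaussian units; the rounded values of the constants are supplied here for illustration and are not quoted from the text.

```python
import math

e = 4.80320e-10  # electron charge in e.s.u. (Gaussian units)
m = 9.10938e-28  # electron mass in grams
c = 2.99792e10   # velocity of light in cm/s

# The length e^2/m of the text, with c restored: e^2/(m c^2), in cm.
cut_off_wavelength = e ** 2 / (m * c ** 2)

# Corresponding cut-off value of |k|, of the order 2*pi*m/e^2, in cm^-1.
k_cut = 2 * math.pi / cut_off_wavelength
```

The length comes out at a little under $3 \times 10^{-13}$ cm, so the cut-off leaves untouched all processes whose important wavelengths are much greater than this.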

It is probable that some deep-lying changes will have to be made
in the present formalism before it will provide a reliable theory for
radiative processes involving short wavelengths. These changes may
correspond to a departure from the point-charge model of elementary
particles which provides the basis of the present theory. Already in
the classical theory the point-charge model involves some difficulties
in interpretation and application,† even though it leads to well-defined
equations of motion, as given in § 78, so it is not surprising that the
passage to the quantum theory brings in further difficulties.

† See Dirac, Proc. Roy. Soc. A 167 (1938), 148 and Eliezer, Proc. Camb. Phil.
Soc. 39 (1943), 173.